Key Takeaways
- The current popular implementations of service meshes (Istio, Linkerd, Consul Connect, etc.) cater only to request/response-style synchronous communication between microservices
- For the advancement and adoption of service meshes, we believe that it is critical that they support event-driven or messaging-based communication
- There are two main architectural patterns for implementing messaging support within a service mesh: the protocol-proxy sidecar, which proxies all inbound and outbound events for the producer and consumer, and the HTTP-bridge sidecar, which translates the event-driven communication protocol to HTTP or a similar protocol
- Regardless of the bridging pattern that is used, the sidecar can facilitate the implementation (and correct abstraction) of cross-functional features such as observability, throttling, tracing, etc.
Service meshes are increasingly becoming popular as an essential technology and an architectural pattern on which to base microservices and cloud-native architecture. A service mesh is primarily a networking infrastructure component that allows you to offload the network communication logic from your microservices-based applications so that you can fully focus on the business logic of your service.
A service mesh is built around the concept of a proxy, which is colocated with the service as a sidecar. Although a service mesh is often advertised as a platform for any cloud-native application, popular implementations of service meshes (Istio/Envoy, Linkerd, etc.) currently only cater to the request/response style of synchronous communication between microservices. However, in most pragmatic microservices use cases, interservice communication takes place over a diverse set of patterns, such as request/response (HTTP, gRPC, GraphQL) and event-driven messaging (NATS, Kafka, AMQP). Since current service-mesh implementations do not support event-driven communication, most of the commodity features that service meshes offer are only available for synchronous request/response services - event-driven microservices must support those features as part of the service code itself, which contradicts the very objective of service-mesh architecture.
It is critical that a service mesh supports event-driven communication. This article looks at the key aspects of supporting event-driven architecture in a service mesh and how existing service-mesh technologies are trying to address these concerns.
Implementing event-driven messaging
In a typical request/response synchronous messaging scenario, you will find a service (server) and a consumer (client) that invokes the service. The service-mesh data plane acts as the intermediary between the client and the service. In event-driven communication, the communication pattern is drastically different. An event producer asynchronously sends the events to an event broker, with no direct communication channel between the producer and consumer. The communication style can either be pub-sub (multiple consumers) or queue-based (single consumer), and depending on the style, the producer can send messages to either a topic or queue respectively.
The consumer subscribes to a topic or a queue that resides in the event broker and is fully decoupled from the producer. When new messages are available for that topic or queue, the broker pushes them to the consumer.
There are a couple of ways to use service-mesh abstraction for event-driven messaging.
Protocol-proxy sidecar
The protocol-proxy pattern is built around the concept that all the event-driven communication channels should go through the service-mesh data plane (i.e., the sidecar proxy). To support event-driven messaging protocols such as NATS, Kafka, or AMQP, you need to build a protocol handler/filter specific to the communication protocol and add that to the sidecar proxy. Figure 1 shows the typical communication pattern for event-driven messaging with a service mesh.
Figure 1: Event-driven messaging with a service mesh
As most event-driven communication protocols are implemented on top of TCP, the sidecar proxy can have protocol handlers/filters built on top of TCP to specifically handle the abstractions required to support each of the various messaging protocols.
The producer microservice (Microservice A) sends messages to the sidecar via the underlying messaging protocol (Kafka, NATS, AMQP, etc.), using the simplest possible producer-client code, while the sidecar handles most of the protocol-related complexity. Similarly, the logic of the consumer service (Microservice B) stays quite simple while the complexity resides in the sidecar. The abstractions provided by the service mesh may change from protocol to protocol.
The Envoy team is currently working on implementing Kafka support for the Envoy proxy based on the above pattern. It is still a work in progress, but you can track the progress on GitHub.
HTTP-bridge sidecar
Rather than using a proxy for the event-driven messaging protocol, we can build an HTTP bridge that translates messages to/from the required messaging protocol. One of the key motivations for this bridging pattern is that most event brokers offer REST APIs (e.g., the Kafka REST API) for producing and consuming messages. As shown in figure 2, existing microservices can transparently use the underlying event broker's messaging system simply by talking to the sidecar, which bridges the two protocols. The sidecar proxy is primarily responsible for receiving HTTP requests and translating them into Kafka/NATS/AMQP/etc. messages, and vice versa.
Figure 2: The HTTP bridge allows the service to communicate with the event broker via HTTP
Similarly, you can use the HTTP bridge to allow microservices based on Kafka/NATS/AMQP to communicate directly with HTTP (or other request/response) microservices, as in figure 3. In this case, the sidecar receives Kafka/NATS/AMQP messages, forwards them as HTTP requests, and translates the HTTP responses back into Kafka/NATS/AMQP messages. There are some ongoing efforts to add support for this pattern to Envoy and NATS (e.g., an AMQP/HTTP bridge and a NATS/HTTP bridge, both for Envoy).
Figure 3: The HTTP Bridge allows services based on event-driven messaging protocols to consume HTTP services
Although the HTTP-bridge pattern works for certain use cases, it is not robust enough to serve as the standard way of handling event-driven messaging in a service-mesh architecture, because bridging an event-driven messaging protocol to a request/response protocol always imposes limitations. It is more or less a workaround that suits some scenarios.
Key capabilities of an event-driven service mesh
The capabilities of a conventional service mesh based on request/response-style messaging are somewhat different from the capabilities of a service mesh that supports messaging paradigms. Here are some of the unique capabilities a service mesh that supports event-driven messaging will offer:
- Consumer and producer abstractions - With most messaging systems, such as Kafka, the broker itself is intentionally simple (a dumb pipe, in microservices terms) and your services are smart endpoints (most of the smarts live in the producer or consumer code). This means that producers and consumers must carry a lot of messaging-protocol code alongside the business logic. With the introduction of a service mesh, you can offload such commodity features (e.g., partition rebalancing in Kafka) related to the messaging protocol to the sidecar and fully focus on the business logic in your microservice code.
- Message-delivery semantics - There are many message-delivery semantics such as "at most once", "at least once", "exactly once", etc. Depending on what the underlying messaging system supports, you can offload those tasks to the service mesh (this is analogous to supporting circuit breakers, timeouts, etc. in the request/response paradigm).
- Subscription semantics - You can also use the service-mesh layer to handle the subscription semantics, such as durable subscription of the consumer-side logic.
- Throttling - You can control and govern the message consumption limits (rate limiting) based on various parameters such as the number of messages, message size, etc.
- Service discovery (broker, topics, and queue discovery) - The service-mesh sidecar allows you to discover the broker location, topic, or queue name during message production and consumption. This involves handling different topic hierarchies and wildcards.
- Message validation - Validating messages is becoming increasingly important in event-driven messaging because most messaging systems, such as Kafka and NATS, are payload agnostic; message validation therefore ends up as part of the consumer or producer implementation. The service mesh can provide this abstraction so that a consumer or producer can offload the message validation to the sidecar. For example, if you use Kafka along with Avro for schema validation, the sidecar can do the validation (i.e., fetch the schema from an external schema registry such as Confluent and validate each message against that schema; see the sketch after this list). You can also use this to check messages for malicious content.
- Message compression - Certain event-based messaging protocols, such as Kafka, allow the data to be compressed by the producer, written in the compressed format to the server, and decompressed by the consumer. You can easily implement such capabilities at the sidecar-proxy level and control them at the service-mesh control plane.
- Security - You can secure the communication between the broker and consumers/producers by enabling TLS at the service-mesh sidecar level so that your producer and consumer implementations do not need to worry about secured communication and can communicate with the sidecar in plain text.
- Observability - As all communications take place over the service-mesh data plane, you can deploy metrics, tracing, and logging out of the box for all event-driven messaging systems.
About the Author
Kasun Indrasiri is the director of Integration Architecture at WSO2 and is an author/evangelist on microservices architecture and enterprise-integration architecture. He wrote the books Microservices for the Enterprise (Apress) and Beginning WSO2 ESB (Apress). He is an Apache committer and has worked as the product manager and an architect of WSO2 Enterprise Integrator. He has presented at the O'Reilly Software Architecture Conference, GOTO Chicago 2019, and most WSO2 conferences. He attends most of the San Francisco Bay Area microservices meetups. He founded the Silicon Valley Microservices, APIs, and Integration meetup, a vendor-neutral microservices meetup in the Bay Area.