In a recent InfoQ podcast, Lin Sun and Neeraj Poddar discussed the release of Istio 1.5 and explored the future of the service mesh space. Topics covered included the Istio community's motivations for migrating to the “istiod” monolithic control plane, the addition of WebAssembly data plane (Envoy proxy) extension support, and the future of multi-cluster support within Istio.
Sun, senior technical staff member and master inventor at IBM, and Poddar, engineering lead and architect at Aspen Mesh, began the discussion by providing an overview of service mesh technology. When adopting a microservices-based architecture, a service mesh is one approach to providing service discovery, traffic management, and cross-cutting communication concerns. As noted by Sun:
[...] as people go into microservices and cloud native, we've observed that there's a common set of problems among these microservices that many people start trying to solve. For example, how am I going to connect to my microservices? How am I going to do retries? How am I going to observe my microservices? How am I going to secure the communication of my microservices?
The “data plane” component of a service mesh is a proxy that conditionally translates, forwards, and observes every network packet that flows to and from a service network endpoint. A “control plane” takes all the individual instances of the data plane (proxies) and turns them into a distributed system that can be visualized and controlled by an operator.
The data plane of most modern service mesh implementations runs out-of-process as a proxy sidecar. The Envoy proxy is commonly used in many service mesh implementations, including Istio.
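The split between data plane and control plane described above can be illustrated with a small, purely conceptual sketch. It does not use any Istio or Envoy API: the `SidecarProxy`, `ControlPlane`, and `RetryPolicy` names are hypothetical, and the "service" is just a Python function. The proxy forwards and observes each request, while the control plane pushes a retry policy to every registered proxy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RetryPolicy:
    """Hypothetical policy that the control plane distributes to proxies."""
    max_attempts: int = 1


class SidecarProxy:
    """Toy data plane: forwards calls to one service and records what it sees."""

    def __init__(self, service: Callable[[str], str]) -> None:
        self.service = service
        self.policy = RetryPolicy()
        self.observed: List[str] = []  # simple stand-in for telemetry

    def handle(self, request: str) -> str:
        last_error: Optional[Exception] = None
        for attempt in range(1, self.policy.max_attempts + 1):
            try:
                response = self.service(request)
                self.observed.append(f"{request}: ok on attempt {attempt}")
                return response
            except ConnectionError as err:
                last_error = err
                self.observed.append(f"{request}: failed attempt {attempt}")
        raise ConnectionError("all attempts failed") from last_error


class ControlPlane:
    """Toy control plane: knows every proxy and configures them as a group."""

    def __init__(self) -> None:
        self.proxies: Dict[str, SidecarProxy] = {}

    def register(self, name: str, proxy: SidecarProxy) -> None:
        self.proxies[name] = proxy

    def push_policy(self, policy: RetryPolicy) -> None:
        for proxy in self.proxies.values():
            proxy.policy = policy


def make_flaky_service(failures: int) -> Callable[[str], str]:
    """Return a service that fails `failures` times before succeeding."""
    remaining = {"count": failures}

    def service(request: str) -> str:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise ConnectionError("upstream unavailable")
        return f"handled {request}"

    return service


control_plane = ControlPlane()
proxy = SidecarProxy(make_flaky_service(failures=2))
control_plane.register("reviews", proxy)
control_plane.push_policy(RetryPolicy(max_attempts=3))
result = proxy.handle("GET /reviews")
```

The service code itself contains no retry or telemetry logic. In this sketch those concerns live entirely in the proxy and are configured centrally, which is the property a service mesh aims to provide.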
The recent release of Istio 1.5 saw the deployment packaging of the control plane move from a microservice-based approach to that of a monolithic implementation, named “istiod”. In a recent Istio blog post, “Introducing istiod: simplifying the control plane”, Craig Box, Kubernetes / Istio advocacy lead at Google Cloud, provided an overview of the history of the Istio control plane, and discussed both the cost of complexity and benefit of consolidation.
Christian Posta, global field CTO at Solo.io, has also published “Istio as an Example of When Not to Do Microservices”, which provided additional rationale for what is potentially quite a significant architectural change. Both Sun and Poddar agreed that the quality of the associated decision making and planning related to such a change was a sign of growth and maturity within the Istio community.
Istio 1.5 now supports data plane extensions written in WebAssembly (Wasm). These extensions can modify network requests and responses and perform out-of-band actions, such as authentication and authorization. Poddar commented:
It enables users to dynamically program [the data plane] with a much enhanced user experience. I'm really excited for where this lands. I think this can be a game changer, not just for Istio, but the entire proxy landscape.
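As a rough illustration of what a data plane extension does, the sketch below models a chain of filters that can modify a request, short-circuit it with an authorization failure, or modify the response on the way back. This is a conceptual model in plain Python, not the actual Wasm extension interface used by Envoy or Istio; the `Filter`, `TokenAuthFilter`, `TraceHeaderFilter`, and `run_chain` names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


class Filter:
    """Hypothetical extension interface; not the real Wasm ABI."""

    def on_request(self, request: Request) -> Optional[Response]:
        """Return a Response to stop processing early, or None to continue."""
        return None

    def on_response(self, response: Response) -> None:
        """Modify the response in place on its way back to the caller."""


class TokenAuthFilter(Filter):
    """Rejects requests that do not carry the expected bearer token."""

    def __init__(self, valid_token: str) -> None:
        self.valid_token = valid_token

    def on_request(self, request: Request) -> Optional[Response]:
        expected = f"Bearer {self.valid_token}"
        if request.headers.get("authorization") != expected:
            return Response(status=401)
        return None


class TraceHeaderFilter(Filter):
    """Adds a request ID on the way in and a marker header on the way out."""

    def on_request(self, request: Request) -> Optional[Response]:
        request.headers.setdefault("x-request-id", "demo-id")
        return None

    def on_response(self, response: Response) -> None:
        response.headers["x-filtered-by"] = "trace-header-filter"


def run_chain(
    filters: List[Filter],
    request: Request,
    upstream: Callable[[Request], Response],
) -> Response:
    """Run request filters in order, call upstream, then run response filters in reverse."""
    for f in filters:
        early = f.on_request(request)
        if early is not None:
            return early
    response = upstream(request)
    for f in reversed(filters):
        f.on_response(response)
    return response


def upstream(request: Request) -> Response:
    return Response(status=200)


chain = [TokenAuthFilter("secret"), TraceHeaderFilter()]
denied = run_chain(chain, Request("/orders"), upstream)
allowed_request = Request("/orders", {"authorization": "Bearer secret"})
allowed = run_chain(chain, allowed_request, upstream)
```

In this sketch the unauthenticated request never reaches the upstream service, while the authenticated one is passed through with additional headers attached by the second filter.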
The topic of open standards was also discussed in the podcast. Poddar argued that standardisations like the Service Mesh Interface (SMI) can add a lot of value, but the user requirements, common use cases, and the core abstractions of the underlying technology must be well understood.
Looking to the future, Sun noted that multi-cluster and mesh expansion (out-of-cluster) support is continually improving in Istio and many other service mesh implementations:
We already have a pretty rich support, I would say, around multi-clusters [...], you can have Istio maybe running in one cluster for the xDS serving, but on the other cluster you also have a lightweight Istio running but not doing xDS serving, just to manage the certificate and sidecar injection. In the future, we're actually going to involve a little bit more in this space.
Sun also mentioned that Istio has a replicated control plane and a multi-cluster pattern in which engineers can configure heterogeneous clusters, where the clusters do not need to have the same services deployed. Troubleshooting can be challenging, however:
Sometimes if you ever run through a multi-cluster scenario, you will notice that if it fails, it's a little bit harder to troubleshoot today. That's an area we definitely want to look into, to provide more guidance and automation to our users.
The podcast audio, show notes, and a full transcript can be found in the article “Lin Sun and Neeraj Poddar on Istio, Wasm, and the Future of Service Mesh”.