Key Takeaways
- The ambient sidecar-less data plane is designed to be transparent to your application, so onboarding requires no application changes. It also eliminates many of Istio’s sidecar application requirements.
- With a destination service-only waypoint proxy, the configuration only needs to contain the limited set of resources the waypoint proxy actually connects to, rather than the dynamic resources for every service it could potentially connect to.
- The Istio community decided to implement the zero trust tunnel with Rust due to its natural fit for a high-performance network proxy along with rich libraries that support work stealing.
- Sidecars are not going away and will continue to play a significant role in many use cases.
- Sidecars can be used when you need a destination service configuration that is more granular than per service account.
Istio ambient mesh introduces a new sidecar-less data plane option for the Istio service mesh, with the goal of simplifying application onboarding, enabling incremental adoption, and reducing infrastructure costs for Istio mesh users. Ambient mesh supports both sidecar and sidecar-less data plane architectures, so you can choose either or both based on your applications’ needs. In Istio 1.16, sidecars were enhanced to support HBONE (HTTP-Based Overlay Network Environment) so they can interoperate with sidecar-less applications via the ztunnel (zero-trust tunnel, which provides the secure overlay layer) and/or the waypoint proxy (which provides the Layer 7 processing layer), both of which also understand HBONE.
Advantages of Ambient Sidecar-Less
The biggest advantage of ambient is that it requires no change to your application, which is why it is called ambient. The ambient sidecar-less data plane is designed to be transparent to your application: for example, there is no need to change your application’s CI/CD pipeline or restart your application whenever a new vulnerability is found in the data plane (either the Envoy-based waypoint proxy or the Rust-based ztunnel, see more below). The sidecar-less data plane also broadens application support by eliminating many of Istio’s sidecar application requirements, such as the restrictions on server-send-first protocols, the inability to support Kubernetes Jobs, and the list of reserved sidecar ports.
The two-layer data plane approach in ambient (a secure overlay layer and an L7 processing layer) lets you adopt the sidecar-less data plane incrementally, in contrast to the all-or-nothing injection of sidecars. You can start with the secure overlay layer and enjoy all the benefits of that layer, such as mTLS with cryptographic identity, simple Layer 4 authorization policy, and telemetry. Because it does no L7 processing, the secure overlay layer dramatically reduces the attack surface and how often you need to update the data plane for CVEs and other patches. The two-layer architecture lets you pay only for what you need and scale the service mesh data plane independently from your workloads, which reduces your infrastructure costs.
What’s New in Istio Ambient Development?
The Istio team is working hard to make ambient mesh part of the next Istio release, and we’ve set up the ztunnel and ambient project boards to track our progress and heartily welcome contributions from the community. All ambient mesh contributors meet every Wednesday at 1 pm ET to discuss new design docs or any concerns from our contributors. Below are the top two changes I want to highlight:
Rust-based ztunnel
When Istio ambient service mesh was announced on Sept 7, 2022, the ztunnel component was implemented using Envoy proxy, simply because we wanted to make Istio ambient mesh available for everyone to install and explore as early as possible. Shortly after the initial announcement, the community evaluated whether ztunnel should continue to use Envoy or be rewritten from scratch in Rust, and John Howard started the Rust-based ztunnel project. A lot of thought went into how to simplify the Envoy-based ztunnel and remove the need for internal listeners, but in the end, the community decided to join forces with the Rust-based ztunnel project for the following reasons:
- Rust is a natural fit for a high-performance network proxy with low resource utilization. Ztunnel provides the secure overlay layer with much-reduced functionality and attack surface, so it is easier to write than a full-featured proxy.
- Rust has a rich library ecosystem to draw on, including the Tokio async runtime.
- Rust has a defined CVE process for us to leverage.
- Last but not least, unlike Envoy, Rust supports work stealing through the Tokio runtime. This is important for ztunnel to reuse connections effectively.
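For readers unfamiliar with the term, work stealing lets an idle worker take queued tasks from a busy worker instead of waiting. The following is a toy, single-threaded Python simulation of that idea, assuming two workers that each own a task queue. It is not how Tokio or ztunnel is implemented, and the function name is hypothetical.

```python
from collections import deque


def run_with_work_stealing(initial_queues):
    """Simulate workers taking turns; return (worker, task, stolen) tuples."""
    queues = [deque(q) for q in initial_queues]
    log = []
    while any(queues):
        for worker, own in enumerate(queues):
            if own:
                # Normal case: run the next task from the worker's own queue.
                log.append((worker, own.popleft(), False))
                continue
            # Idle worker: steal from the tail of the longest queue, if any.
            victim = max(range(len(queues)), key=lambda i: len(queues[i]))
            if queues[victim]:
                log.append((worker, queues[victim].pop(), True))
    return log


# Worker 0 starts with all the work; worker 1 starts idle.
log = run_with_work_stealing([["a", "b", "c", "d"], []])
for worker, task, stolen in log:
    print(f"worker {worker} ran {task}" + (" (stolen)" if stolen else ""))
```

In this run each worker ends up executing two of the four tasks, even though all of them were initially queued on worker 0.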
To learn more about our decision for a Rust-based ztunnel versus an Envoy-based one, please refer to our thoughts explained in more detail in this blog.
Destination service only waypoint proxy
When Istio ambient service mesh was originally announced, the waypoint proxy configuration was slightly easier to understand than the ztunnel configuration because it handled only the workloads sharing the same service account, i.e. one waypoint proxy per service account. However, the waypoint proxy configuration was still very complex, since the source waypoint proxy was aware of all other services in the Kubernetes cluster, regardless of whether those services were actual destinations.
Figure 1: Source waypoint proxy awareness of all other services (only sidecar-less services are shown here, but they could also be services with sidecars or out-of-mesh services)
Sidecar resources, introduced in Istio v1.1, are commonly used in Istio environments to trim Envoy sidecars’ configurations down to only what is required, improving the sidecars’ performance and resource utilization. As we started to evaluate whether we would need to support Sidecar resources for the waypoint proxy (which is also Envoy based), we realized we could trim the waypoint proxy’s configuration drastically by providing a destination service-only waypoint proxy.
By focusing on a destination service-only waypoint proxy, the waypoint proxy configuration only needs to contain details about the very limited dynamic clusters, endpoints, and routes that the waypoint proxy actually connects to, rather than the clusters, endpoints, and routes for every service in its Kubernetes cluster that it could potentially connect to. This change effectively eliminates the need to support Sidecar resources for waypoint proxies, which also saves users from having to configure Sidecar resources manually.
Figure 2: Destination waypoint aware of the destination service(s) but not other services
For example, in my Kubernetes cluster, I have the sleep, helloworld, and httpbin applications deployed sidecar-less in the default namespace. I also have the httpbin application deployed with sidecars in the foo namespace.
Figure 3: helloworld, httpbin, and sleep applications deployed without sidecars, httpbin in the foo namespace deployed with a sidecar
Below is the routes configuration of httpbin’s sidecar in the foo namespace, which is very similar to a source waypoint proxy’s because both are aware of all the other services’ routes:
Figure 4: Routes config of httpbin’s sidecar
In comparison, below is the much-reduced routes configuration for httpbin’s waypoint proxy. Note there are no routes related to the helloworld or sleep applications or the httpbin application in the foo namespace. While dynamic routes are used as examples here, dynamic clusters and endpoints are also reduced for destination-only waypoint proxies when compared with sidecars.
Figure 5: Routes config of httpbin’s waypoint proxy
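To make the difference in scope concrete, here is a minimal Python sketch of the idea; it is not Istio’s actual configuration generation. It models the example cluster above as a list of (namespace, service) pairs. The function names and the `served` parameter are hypothetical, introduced only for illustration, and real proxies also carry routes for system services that are omitted here.

```python
# Services from the example cluster described above: (namespace, service name).
SERVICES = [
    ("default", "sleep"),
    ("default", "helloworld"),
    ("default", "httpbin"),
    ("foo", "httpbin"),
]


def sidecar_route_scope(services):
    """Without trimming, a sidecar is aware of routes for every service."""
    return sorted(f"{name}.{ns}" for ns, name in services)


def destination_waypoint_route_scope(services, namespace, served):
    """A destination-only waypoint needs routes only for the services it fronts."""
    return sorted(
        f"{name}.{ns}"
        for ns, name in services
        if ns == namespace and name in served
    )


print(sidecar_route_scope(SERVICES))
print(destination_waypoint_route_scope(SERVICES, "default", {"httpbin"}))
```

In this toy model the sidecar’s scope grows with every service added to the cluster, while the waypoint’s scope grows only when it fronts more destination services.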
Destination service only waypoint proxy means that there won’t be any source waypoint proxy. Without a source waypoint proxy, what happens if your destination service doesn’t have a waypoint proxy, for example, an AWS Lambda service, and you want to add resilience when connecting to the destination service?
In this case, you’d need an egress gateway or a dedicated proxy to handle egress traffic. The nice part is that this proxy would contain only a trimmed-down list of the external services you explicitly declare, so you avoid the bloated configuration issue mentioned earlier without needing Sidecar resources or the networking.istio.io/exportTo annotation on your destination service to trim unnecessary configuration.
Ambient Sidecar-Less is Great, What about Sidecars?
Sidecars will not be going away soon - you can continue to use sidecars as long as you feel comfortable with them, or simply because you have already obtained all the required approvals from your security team. Even as ambient sidecar-less matures, I expect that sidecars will continue to play a significant role in the following use cases:
- Source service requires specific client-side configuration
With a destination service-only waypoint, the waypoint acts like a gateway for the destination service(s), where the waypoint proxy implements the traffic management and policy enforcement functions. This also means all source services share the same enforcement and cannot configure client-specific overrides. In Istio’s VirtualService resource, you can use sourceLabels to configure a fault injection, retry, or timeout override that is specific to a given source; for example, adding HTTP fault injection only for client pods with the label `env: prod`.
What if a specific source service wants to apply client-side overrides to retry, timeout, fault injection, or load balancer configurations? You can use a sidecar, which offers granular configuration overrides for each client, so that your client doesn’t need to use the defaults provided by the destination service.
Figure 6: Source1 using a sidecar for its configuration overrides
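As an illustration of such a source-specific override, the sketch below builds a VirtualService manifest as a plain Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The host `httpbin`, the resource name, the 503 status, and the 10% rate are assumptions chosen for the example, and the helper function is hypothetical; only the `sourceLabels` match on `env: prod` comes from the text above.

```python
import json


def source_specific_fault(host, source_labels, http_status=503, percent=10.0):
    """Build a VirtualService that aborts a share of requests from matching clients."""
    default_route = [{"destination": {"host": host}}]
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-source-fault"},  # hypothetical name
        "spec": {
            "hosts": [host],
            "http": [
                {
                    # Applies only to client pods carrying these labels.
                    "match": [{"sourceLabels": source_labels}],
                    "fault": {
                        "abort": {
                            "percentage": {"value": percent},
                            "httpStatus": http_status,
                        }
                    },
                    "route": default_route,
                },
                # All other clients get the plain route with no fault.
                {"route": default_route},
            ],
        },
    }


manifest = source_specific_fault("httpbin", {"env": "prod"})
print(json.dumps(manifest, indent=2))
```

Requests from pods without the `env: prod` label fall through to the second route and are not affected by the injected fault.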
- Destination service requires destination workload-specific policies
A waypoint proxy is deployed per service account or per namespace. What if you need a configuration more granular than a service account for services that share the same service account? For example, you need a specific Telemetry, WasmPlugin, RequestAuthentication, or EnvoyFilter configuration for the Destination 1 service but not for the Destination 2 service, and both share the same service account. You can continue to use sidecars when you need a destination service configuration that is more granular than per service account. Alternatively, instead of running a sidecar proxy, you could create a dedicated waypoint proxy for Destination 1 with its own service account.
Figure 7: Destination service-specific policy is enforced on Destination 1 service using sidecars
- Sidecar and sidecar-less can co-exist and interop
A starting boundary for sidecar vs sidecar-less is at namespace level, where you define one or more specific namespaces to be sidecar-less via the istio.io/dataplane-mode=ambient namespace label. When the sidecar injection label co-exists with the ambient sidecar-less label on the namespace, the sidecar injection label always wins. This design ensures you can easily migrate from sidecar to sidecar-less, or from sidecar-less to sidecar based on your specific business requirements.
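The precedence rule above can be sketched as a small Python function. This is a simplified model under stated assumptions, not Istio’s implementation: it treats `istio-injection=enabled` or the presence of an `istio.io/rev` label as the sidecar injection label, looks only at namespace labels, and uses a hypothetical function name.

```python
def namespace_dataplane_mode(labels):
    """Return 'sidecar', 'ambient', or 'none' for a namespace's labels."""
    # Assumption: these two labels are what "sidecar injection label" refers to.
    injected = (
        labels.get("istio-injection") == "enabled" or "istio.io/rev" in labels
    )
    if injected:
        # The sidecar injection label wins over the ambient label.
        return "sidecar"
    if labels.get("istio.io/dataplane-mode") == "ambient":
        return "ambient"
    return "none"


print(namespace_dataplane_mode({"istio.io/dataplane-mode": "ambient"}))
print(
    namespace_dataplane_mode(
        {"istio.io/dataplane-mode": "ambient", "istio-injection": "enabled"}
    )
)
```

Under this model, migrating a namespace in either direction amounts to adding or removing the sidecar injection label while the ambient label stays in place.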
The Future of Istio Ambient Mesh
A lot of exciting work is happening in the Istio community for ambient mesh. Ambient mesh has graduated from the experimental branch and been merged into the upstream main branch, so it can be easily installed with the upcoming Istio 1.18 release or newer. We are continuing to evolve ambient mesh to improve its performance, scalability, and debuggability, as the Rust-based ztunnel and destination service-only waypoint proxy updates above show. As the community works towards making ambient mesh production ready as the default in Istio, we invite you to be part of the journey and help shape ambient mesh with your feedback or contributions, in the ambient channel on the Istio Slack or on GitHub.