KubeCon EU: Mercedes-Benz’s Migration From Pod Security Policies to Validation Admission Policies

During KubeCon EU, the Mercedes-Benz team presented their migration journey from Pod Security Policies to Validating Admission Policies to secure their 1000+ Kubernetes clusters. They chose this solution over Kyverno because of its better performance.

According to Tobias Giese and Tjark Rasche, both platform engineers from Mercedes-Benz Tech Innovation, the company's platform was conceived as "self-service". Software development teams can commission or decommission Kubernetes clusters on demand, without considering any low-level operations.

The initial cluster implementation used Pod Security Policies to enforce security at runtime. Requests to the Kubernetes API server were secured through Open Policy Agent (which ran as a static pod manifest) in conjunction with the validating and authorisation webhooks. Giese described this solution as "all-in-one" but confusing and "bug-prone".

Rasche enumerated the team’s requirements for a custom admission policy. It has to be very flexible in defining policies for multiple resources (Deployments, CronJobs, etc.), not only for Pods. It should also let customers use their preferred tools seamlessly and allow resource mutation, as "Kubernetes tends to have quite insecure defaults" (for example, allowPrivilegeEscalation in a container's security context, which defaults to true).
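The insecure-default example can be made concrete with a small sketch. The Python below is an illustration only, not the team's controller: the function name `default_privilege_escalation` is hypothetical, and the code merely shows the general shape of a mutating admission step, namely a list of JSON Patch operations that set `allowPrivilegeEscalation: false` on containers that leave it unset.

```python
import json


def default_privilege_escalation(pod):
    """Return JSON Patch operations that set allowPrivilegeEscalation to
    false on every container that does not set it explicitly.

    Hypothetical helper for illustration; only spec.containers is handled,
    init and ephemeral containers are left out to keep the sketch short.
    """
    patches = []
    for i, container in enumerate(pod.get("spec", {}).get("containers", [])):
        base = f"/spec/containers/{i}/securityContext"
        security_context = container.get("securityContext")
        if security_context is None:
            # No securityContext at all: add the whole object.
            patches.append(
                {"op": "add", "path": base,
                 "value": {"allowPrivilegeEscalation": False}}
            )
        elif "allowPrivilegeEscalation" not in security_context:
            # securityContext exists but leaves the field unset.
            patches.append(
                {"op": "add", "path": f"{base}/allowPrivilegeEscalation",
                 "value": False}
            )
    return patches


if __name__ == "__main__":
    # Made-up pod with three containers covering the three cases.
    example_pod = {
        "spec": {
            "containers": [
                {"name": "app"},
                {"name": "sidecar", "securityContext": {"runAsNonRoot": True}},
                {"name": "explicit",
                 "securityContext": {"allowPrivilegeEscalation": True}},
            ]
        }
    }
    print(json.dumps(default_privilege_escalation(example_pod), indent=2))
```

A container that sets the field explicitly, even to true, is left alone here; rejecting such a value would be the job of a validation rule rather than a mutation.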

Their migration wasn’t a straightforward journey. After initially trying Kyverno, a CNCF incubating project designed for policy management, they dismissed it because API requests had response times of up to 11 seconds, twenty times slower than their initial solution. They noted, however, that performance improved in Kyverno 1.12.0 after one of their colleagues shared benchmarks, measured with Grafana’s k6, with the Kyverno maintainers. Despite the performance problems they encountered, they praised the developer experience of using Kyverno.

To move forward, they regrouped and decided to focus their efforts on three directions:

  • Implement all policies using the current setup in OPA
  • Evaluate the performance of a custom controller proof-of-concept
  • Improve the k6 load testing suite to be closer to real-world scenarios
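The kind of latency comparison behind these steps can be sketched as follows. This is not the team's k6 suite (k6 scripts are written in JavaScript); it is a minimal Python stand-in, with the hypothetical helpers `measure`, `percentile` and `summarize`, that times repeated calls and reports nearest-rank percentiles. The `call` argument is assumed to be any function that issues one request against the API server.

```python
import math
import time


def measure(call, runs):
    """Time `call` `runs` times and return the durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Collapse raw durations into the figures usually compared."""
    return {
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload; a real run would send an admission-triggering
    # request to a test cluster instead.
    print(summarize(measure(lambda: sum(range(1000)), runs=50)))
```

Reporting high percentiles alongside the median matters for admission control, because a single slow webhook call delays the API request that triggered it.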

With the Kubernetes 1.26 release approaching, and not wanting to fall more than one version behind, they decided to migrate the control plane from version 1.24 to 1.25 and then to 1.26. The worker nodes were subsequently migrated directly from 1.24 to 1.26.

During the migration they discovered the "holy grail": Validating Admission Policies, released as alpha in Kubernetes 1.26. The feature promised custom admission logic and improved performance, thanks to an in-memory Abstract Syntax Tree, without requiring additional controllers.

Rasche presented a basic admission policy, pointing out that the validation expression is written in Common Expression Language (CEL), a new, performance-focused language conceived for use on critical code paths.
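To give a sense of what such a policy looks like, the sketch below builds one in Python and prints it as JSON, which `kubectl apply -f` accepts. It is not the policy shown in the talk: the policy name, the rule and the helper functions are illustrative assumptions, and the `admissionregistration.k8s.io/v1` API version is assumed as well (the alpha release in Kubernetes 1.26 used `v1alpha1`).

```python
import json

# CEL expression evaluated by the API server for each matching request.
# `object` is the incoming resource; a Deployment is assumed here.
CEL_EXPRESSION = (
    "object.spec.template.spec.containers.all(c, "
    "has(c.securityContext) && "
    "has(c.securityContext.allowPrivilegeEscalation) && "
    "c.securityContext.allowPrivilegeEscalation == false)"
)


def build_policy(name):
    """Return a ValidatingAdmissionPolicy manifest as a plain dict."""
    return {
        "apiVersion": "admissionregistration.k8s.io/v1",
        "kind": "ValidatingAdmissionPolicy",
        "metadata": {"name": name},
        "spec": {
            "failurePolicy": "Fail",
            "matchConstraints": {
                "resourceRules": [
                    {
                        "apiGroups": ["apps"],
                        "apiVersions": ["v1"],
                        "operations": ["CREATE", "UPDATE"],
                        "resources": ["deployments"],
                    }
                ]
            },
            "validations": [
                {
                    "expression": CEL_EXPRESSION,
                    "message": "containers must set allowPrivilegeEscalation to false",
                }
            ],
        },
    }


def build_binding(policy_name):
    """Return the binding that puts the policy into effect cluster-wide."""
    return {
        "apiVersion": "admissionregistration.k8s.io/v1",
        "kind": "ValidatingAdmissionPolicyBinding",
        "metadata": {"name": f"{policy_name}-binding"},
        "spec": {"policyName": policy_name, "validationActions": ["Deny"]},
    }


if __name__ == "__main__":
    # Hypothetical policy name, chosen for this example only.
    name = "deny-privilege-escalation"
    print(json.dumps(build_policy(name), indent=2))
    print(json.dumps(build_binding(name), indent=2))
```

A policy has no effect until a binding references it, which is why the two objects are built together.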

He also pointed out that complex policies tend to become an "unreadable mess". After starting to write the policies by hand, they soon switched to generating them with a simple Helm setup. A downside of this approach is that the generator depends heavily on the specific policies being enforced.
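Why generation helps can be illustrated with a short sketch. The team used Helm templates; the Python below swaps in `string.Template` purely for illustration, and the rule and the `generate_expressions` helper are assumptions. The same container check has to be repeated for every resource kind, because the pod spec sits at a different path in each.

```python
from string import Template

# Path to the pod spec inside each resource kind.
POD_SPEC_PATHS = {
    "Pod": "object.spec",
    "Deployment": "object.spec.template.spec",
    "CronJob": "object.spec.jobTemplate.spec.template.spec",
}

# One rule, written once, with the pod spec path left as a placeholder.
RULE = Template(
    "${pod_spec}.containers.all(c, has(c.securityContext) && "
    "has(c.securityContext.allowPrivilegeEscalation) && "
    "c.securityContext.allowPrivilegeEscalation == false)"
)


def generate_expressions():
    """Render the rule once per resource kind."""
    return {
        kind: RULE.substitute(pod_spec=path)
        for kind, path in POD_SPEC_PATHS.items()
    }


if __name__ == "__main__":
    for kind, expression in generate_expressions().items():
        print(f"{kind}: {expression}")
```

The template is tied to the shape of this one rule, which mirrors the downside noted above: a generator has to be written around the policies it is meant to produce.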

The mutation requirement was implemented through a custom controller. Giese pointed out that a new enhancement proposal would allow mutating admission to be implemented similarly to validation. The feature might be available in version 1.31 of Kubernetes. Given that the Validating Admission Policies feature was only released in version 1.26, they needed to use different implementations for each of the Kubernetes versions deployed:

  • OPA with admission-controller (for both validation and mutation) for Kubernetes 1.25
  • VAP and admission-controller (for mutation) for Kubernetes 1.26

After Kyverno 1.12.0 was released, they reran the performance measurements to compare all the solutions mentioned. They also pointed out that Kyverno now supports VAP generation.

They closed their presentation by sharing the lessons they had learned. First, policy benchmarking and a suite of end-to-end tests allow quick iterations without sacrificing performance or quality. Second, they mentioned the caveat of maintaining sub-resources (like pods or ephemeral containers), especially when implementing custom controllers. Finally, they concluded that a Kubernetes-native implementation of policies is feasible, even if it comes with the cost of being an early adopter.

Note: Validating Admission Policies reached beta status in Kubernetes version 1.28.
