Introduction
Estes: This is Phil Estes. I'm a Principal Engineer at AWS. My job is to help you understand and demystify the state of APIs in the container ecosystem. This is maybe a little more difficult a task than usual because there's not one clear, overarching component when we talk about containers. There's runtimes. There's Kubernetes. There's OCI and runC. Hopefully, we'll look at these different layers and make it practical as well, so that you can see where APIs exist, and how vendors and integrators are plugging into various aspects of how containers work in runtimes and Kubernetes.
Developers, Developers, Developers
It's pretty much impossible to have a modern discussion about containers without talking about Docker. Docker came on the scene in 2013, with a definite, huge increase in interest and use in 2014, mainly around developers. Developers loved the concept and the abstraction that Docker had put around this set of Linux kernel capabilities, and really fell in love with the simplicity of this command line: docker build, docker push, docker run. Of course, command lines can be scripted and automated, but it's important to know that this command line has always been a lightweight client.
The Docker engine itself is listening on a socket, and clearly defines an HTTP-based REST API. Every command that you run in the Docker client is calling one or more of these REST APIs to actually do the work to start your container, or to pull or push an image to a registry. Usually, this is local. To many early users of Docker, you just assumed that your docker run was instantly creating a process on your Linux machine or cloud instance, but it was really calling over this remote API. Again, on a Linux system it would be local, but it could be remote over TCP, or, a much better way that was added more recently, tunneled over SSH if you really need to be remote from the Docker engine. The important fact here is that Docker has always been built around an API. That API has matured over the years.
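To make that concrete, here is a minimal sketch of calling the Docker Engine REST API directly over its Unix socket, using only the Go standard library and assuming the default /var/run/docker.sock path; this is the same endpoint that docker ps hits under the covers.

```go
// Minimal sketch: talk to the Docker Engine REST API over its local Unix
// socket (assumes the default /var/run/docker.sock path).
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
)

func main() {
	// Route all HTTP traffic over the local Docker socket instead of TCP.
	client := &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
				return net.Dial("unix", "/var/run/docker.sock")
			},
		},
	}

	// GET /containers/json is the endpoint `docker ps` calls; a version
	// prefix such as /v1.41/containers/json can also be used.
	resp, err := client.Get("http://localhost/containers/json")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // JSON array describing running containers
}
```

The official SDKs and other language clients mentioned next are wrappers around these same HTTP endpoints.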
APIs are where we enable integration and automation. It's great to have a command line, and developers love it, but as you mature your tooling and your security stack and your monitoring, the API has been the place where other language clients have been created, a Python API for Docker containers, and so on. Really, much of the enablement around vendor technology and runtime security tools has come from that initial API that Docker created for the Docker engine.
What's Behind the Docker API?
It will be good for us to understand the key concepts that were behind that API. There are three really key concepts that I want us to start to understand, and we'll see how they affect even higher-layer uses via other abstractions like Kubernetes today. The first one is what I'm going to call the heart of a container, and that's the JSON representation of its configuration. If you've ever used the docker inspect command, you've seen Docker's view of that. Effectively, you have things like the command to run, maybe some cgroups resource limits or settings, and various things about the isolation level. Do you want its own PID namespace? Do you want the PID namespace of the host? Are you going to attach volumes, environment variables? All of this is wrapped up in this configuration object. Around that is an image bundle. This has image metadata, the layers, the actual file system.
Many of you know that if you use a build tool or use something like docker build, it assembles layers of content that are usually used with a copy-on-write file system at runtime to assemble these layers into what you think of as the root file system of your image. This is what's built and pushed and pulled from registries. This image bundle has references to this configuration object and all the layers and possibly some labels or annotations. The third concept is not so much an object or another representation, but the actual registry protocol itself. This is again separate from the Docker API. There's an HTTP based API to talk to an image registry to query or inspect or push content to a remote endpoint. For many, in the early days, this equated to Docker Hub. There are many implementations of the distribution protocol today, and many hosted registries by effectively every cloud provider out there.
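As a rough sketch of that distribution protocol, here is a manifest fetch as a plain HTTP GET. It assumes a local, unauthenticated registry at localhost:5000 that already holds a hypothetical myapp:latest image; hosted registries layer a token-based authentication handshake on top of the same API.

```go
// Minimal sketch: fetch an image manifest via the registry distribution API.
// Assumes an unauthenticated local registry at localhost:5000 holding a
// hypothetical "myapp:latest" image.
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	req, err := http.NewRequest("GET",
		"http://localhost:5000/v2/myapp/manifests/latest", nil)
	if err != nil {
		panic(err)
	}
	// Ask for an OCI (or Docker schema 2) manifest rather than legacy formats.
	req.Header.Set("Accept", "application/vnd.oci.image.manifest.v1+json")
	req.Header.Add("Accept", "application/vnd.docker.distribution.manifest.v2+json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	fmt.Println("digest:", resp.Header.Get("Docker-Content-Digest"))
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // manifest: config descriptor plus layer descriptors
}
```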
The Open Container Initiative (OCI)
The Open Container Initiative was created in 2015 to make sure that this whole space of containers and runtimes and registries didn't fragment into a bunch of different ideas about what these things meant, and to standardize, effectively, around these concepts we just discussed that relate to the Docker API and the Docker implementation. That configuration we talked about became the runtime spec in the OCI. The image bundle became the core of what is now the image spec. That registry API, which wasn't part of the initial charter of the OCI, has now more recently been formalized into the distribution spec. You'll see that even though there are many other runtimes than Docker today, almost all of them are conformant to these three OCI specifications.
There are ways to check and validate that. The OCI community continues to innovate and develop around these specifications. In addition, the OCI has a runtime implementation that can parse and understand that runtime spec and turn it into an isolated process on Linux. That implementation many of you would know as runC. runC was created out of some of the core underlying operating system interfaces that were in the Docker engine. They were brought out of the engine, contributed to the OCI, and became the runC of today. Many of you might recognize the term libcontainer; most of that libcontainer code base is what became runC.
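To make that concrete, here is a rough sketch of generating a minimal runtime-spec config.json in Go, assuming the github.com/opencontainers/runtime-spec/specs-go module; in practice, runc spec generates a much fuller template, and runc run consumes the resulting bundle (config.json plus a rootfs directory).

```go
// Minimal sketch: build an OCI runtime spec ("config.json") using the Go
// types published by the runtime-spec project.
package main

import (
	"encoding/json"
	"os"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

func main() {
	spec := specs.Spec{
		Version: specs.Version, // spec version this config conforms to
		Process: &specs.Process{
			Terminal: true,
			Cwd:      "/",
			Args:     []string{"sh"},                 // the command to run
			Env:      []string{"PATH=/usr/bin:/bin"}, // environment variables
		},
		Root: &specs.Root{
			Path:     "rootfs", // unpacked image layers live here
			Readonly: true,
		},
		Hostname: "demo",
	}

	f, err := os.Create("config.json")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	enc := json.NewEncoder(f)
	enc.SetIndent("", "  ")
	if err := enc.Encode(&spec); err != nil {
		panic(err)
	}
}
```

Pointing runc run at a bundle directory containing this config.json and an unpacked rootfs is, in effect, what happens underneath the Docker engine and containerd.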
What about an API for Containers?
At this point, you might say, I understand about the OCI specs and the standardization that's happened, but I still don't see a common API for containers. You'd be correct. The OCI did not create a standardized API for container lifecycle. The runC command line may be a de facto standard. There have been other implementations of the runC command line, therefore allowing someone to replace runC at the bottom of a container stack and have other capabilities. That's not really a clearly defined API for containers. So far, all we've seen is that Docker has an API, and we now have some standards around those core concepts and principles that allow there to be commonality and interoperability among various runtimes. Before we try and answer this question, we need to go a bit further in our journey and talk a little bit more than just about container runtimes.
We can see that Docker provided a solid answer for handling the container lifecycle on a single node. Almost as soon as Docker became popular, the use of containers in production showed that, at scale, users really needed ways to orchestrate containers. Just as fast as Docker had become popular, there were now a bunch of popular orchestration ideas, everything from Nomad, to Mesos, to Kubernetes, with Docker even creating Docker Swarm to offer its own ideas about orchestration. Really, at this point, we have to dive into what it means to orchestrate containers, and not just talk about running containers on a single node.
Kubernetes
While it might be fun to dive in and try to talk about the pros and cons of various ideas that were hashed around during the "orchestration wars," effectively, we only have time to discuss Kubernetes, the heavyweight in the room. The Cloud Native Computing Foundation was formed around Kubernetes as its first capstone project. We know the use of Kubernetes is extremely broad in our industry. It continues to gain significant amounts of investment from cloud providers and from integrations of vendors of all kinds. The CNCF landscape continues to grow dramatically, year over year. Our focus is going to be on Kubernetes, given that, and the fact that we're continuing to dive into the common APIs and API use around containers.
When we talk about orchestration, it really makes sense to talk about Kubernetes. There are two key aspects, since we're talking about APIs, that I'd like for us to understand. One, coming from the client side, is the Kubernetes API. We're showing one piece of the broader Kubernetes control plane known as the API server. That API server has an endpoint that listens for the Kubernetes API, again, a REST API over HTTP. Many of you, if you're a Kubernetes user, would use it via the kubectl tool. You could also curl that endpoint or use other tools which have been written to talk to the Kubernetes API server.
At the other end of the spectrum, I want to talk a little bit more about the kubelet, this node-specific daemon that's listening to the API server for the placement of actual containers and pods. We're going to talk about how the kubelet talks to an actual container runtime, and that happens over gRPC. Any container runtime that wants to be plugged into Kubernetes implements something known as the Container Runtime Interface.
Kubernetes API
First, let's talk a little bit more about the Kubernetes API. This API server is really a key component of the control plane, and of how clients and tools interact with the Kubernetes objects. We've already mentioned, it's a REST API over HTTP. You probably recognize, if you've been around Kubernetes, or even gone to a 101 Kubernetes talk or workshop, that there's a set of common objects, things like pods, and services, and daemon sets, and many others; these are all represented in a distributed database. The API is how you handle operations: create, update, and delete. The rest of the Kubernetes ecosystem is really using various watchers and reconcilers to handle the operational flow for how these deployments or pods actually end up on a node. The power of Kubernetes is really the extensibility of this declarative state system. If you're not happy with the abstractions given to you, some of these common objects I just talked about, you can create your own custom resource objects, and they're going to live in that same distributed database. You can create custom controllers to handle operations on those.
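As a minimal sketch of that client side, here is roughly what listing pods looks like with the Go client library, assuming the standard k8s.io/client-go and k8s.io/apimachinery modules and a local kubeconfig; kubectl get pods issues essentially the same REST call.

```go
// Minimal sketch: list pods via the Kubernetes API using client-go and the
// same kubeconfig that kubectl reads.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a REST config from the default kubeconfig location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Issues GET /api/v1/namespaces/default/pods against the API server.
	pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.Name, p.Status.Phase)
	}
}
```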
Kubernetes: The Container Runtime Interface (CRI)
As we saw in the initial diagram, the Kubernetes cluster is made up of multiple nodes, and on each node is a piece of software called the kubelet. The kubelet, again, is listening for state changes in the distributed database, and is looking to place pods and deployments onto the local node when instructed to do so by the orchestration layer. The kubelet doesn't run containers itself; it needs a container runtime. Initially, when Kubernetes was created, it used Docker as the runtime. There was a piece of software called the dockershim, part of the kubelet, that implemented this interface between the kubelet and Docker. That implementation has been deprecated and will be removed in the upcoming release of Kubernetes later this month. What you have left is the Container Runtime Interface, created several years ago as a common interface so that any compliant container runtime could serve the kubelet.
If you think about it, the CRI is really the only common API for runtimes we have today. We talked about this earlier: Docker has an API. For containerd, the project I'm a maintainer of, we have a Go API as well as the gRPC API to our services. CRI-O, Podman, Singularity, there are many other runtimes out there across the ecosystem. CRI is really providing a common API, although truly, the CRI is not really used outside of the Kubernetes ecosystem today. Instead of being a common API endpoint that you could use anywhere in the container universe, CRI really tends to only be used in the Kubernetes ecosystem, and pairs with other interfaces like CNI for networking and CSI for storage. If you do implement the CRI, say you're going to create a container runtime and you want to plug into Kubernetes, it's not enough just to represent containers; there's also the idea of a pod and a pod sandbox. These are represented in the definition of the CRI gRPC interfaces. You can look those up on GitHub, and see exactly what interfaces you have to implement to be a CRI-compliant runtime.
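As a rough sketch of those gRPC interfaces in use, here is what a small CRI client might look like, assuming the k8s.io/cri-api and google.golang.org/grpc Go modules and containerd's default CRI socket path (CRI-O exposes its own socket instead); this is the same interface the kubelet drives.

```go
// Minimal sketch: speak CRI over gRPC to the runtime on a node, the way the
// kubelet does. Assumes containerd's CRI socket at /run/containerd/containerd.sock.
package main

import (
	"context"
	"fmt"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	conn, err := grpc.Dial("unix:///run/containerd/containerd.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	client := runtimeapi.NewRuntimeServiceClient(conn)

	// Ask the runtime who it is and which CRI version it speaks.
	version, err := client.Version(context.TODO(), &runtimeapi.VersionRequest{})
	if err != nil {
		panic(err)
	}
	fmt.Println(version.RuntimeName, version.RuntimeVersion)

	// List the containers the runtime is currently managing.
	containers, err := client.ListContainers(context.TODO(), &runtimeapi.ListContainersRequest{})
	if err != nil {
		panic(err)
	}
	for _, c := range containers.Containers {
		fmt.Println(c.Id, c.State)
	}
}
```

The crictl debugging tool from the cri-tools project is essentially a command-line wrapper over this same interface.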
Kubernetes API Summary
Let's briefly summarize what we've seen as we've looked at Kubernetes from an API perspective. Kubernetes has a client API that reflects this Kubernetes object model. It's a well-defined API that's versioned. It uses REST over HTTP. Tools like kubectl use that API. When we talk about how container runtimes are driven from the kubelet, this uses gRPC defined interfaces known as the container runtime interface. Hearkening back to almost the beginning of our talk, when we actually talk about containers and images that are used by these runtimes, these are OCI compliant. That's important because fitting into the broader container ecosystem, there's interoperability between these runtimes because of the OCI specs. If you look at the pod specification in Kubernetes, some of those flags and features that you would pass to a container represent settings in the OCI runtime spec, for example. When you define an image reference, how that's pulled from a registry uses the OCI distribution API. That summarizes briefly both ends of the spectrum of the Kubernetes API that we've looked at.
Common API for Containers?
Coming back to our initial question, have we found that common API for containers? Maybe in some ways. If we're talking in the context of Kubernetes, the CRI is that well-defined common API that abstracts away container runtime differences. It's not used outside of Kubernetes, though, and so we still have other APIs and other models of interacting with container lifecycles when we're not in the Kubernetes ecosystem. However, the CRI API is providing a valuable entry point for integrations and automation in the Kubernetes context. For example, tools from Sysdig, or Datadog, or Aqua Security, or others can use that CRI endpoint, similar to how in the pre-Kubernetes world they might have used the Docker Engine API endpoint, to gather information about what containers are running or to provide other telemetry and security information, coalesced maybe with eBPF tools or other things those agents are running on your behalf. Again, maybe we're going to have to back away from the hope that we would find a common API that covers the whole spectrum of the container universe, and go back to a moniker that Docker used at the very dawn of the container era.
Build, Ship, Run (From an API Perspective)
As you well know, no talk on containers is complete without the picture of a container ship somewhere. That shipping metaphor has been used to good effect by Docker throughout the last several years. One of those monikers that they've used throughout that era has been build, ship, and run. It's a good representation of the phases of development in which containers are used. Maybe instead of trying to find that one overarching API, we should think about for each of these steps in the lifecycle of moving containers from development to production, where do APIs exist? How would you use them? Given your role, where does it make sense? We're going to take that aspect of APIs from here on out, and hopefully make it practical to understand where you should be using what APIs from the container ecosystem.
Do APIs Exist for Build, Ship, and Run?
Let's dive in and look briefly at build, ship, and run as they relate to APIs or standardization that may be available in each of those categories. First, let's look at build. Dockerfile itself, the syntax of how Dockerfiles are put together, has never been standardized in a formal way, but has effectively become a de facto standard. Dockerfile is not the only way to produce a container image. It might be the most traditional and straightforward manner, but there's a lot of tooling out there assembling container images without using Dockerfiles. Of course, a formal API for build is not necessarily a strong requirement in this space, because teams tend to adopt tools that match the requirements of their organization.
Maybe there's already a traditional Jenkins cluster, maybe they have adopted GitLab, or are using GitHub Actions, or other hosted providers, or even vendor tools like Codefresh. What really matters is that the output of these tools is a standard format. We've already talked about OCI and the image format and the registry API, which we'll talk about under ship. It really doesn't matter what the inputs are, what those build tools are, the fact that all these tools are producing OCI compliant images that can be shipped to OCI compliant registries is the standardization that has become valuable for the container ecosystem.
Of course, build ties very closely to ship, because as soon as I assemble an image, I want to put it in a registry. Here, we have the most straightforward answer. Yes, the registry and distribution protocol is an OCI standard today. We talked about that, and how it came to be, coming out of the original Docker registry protocol. Pushing and pulling images and related artifacts is standardized, and the API is stable and well understood. There are still some unique aspects to this around authentication, which is not part of the standard, but at least the core functionality of pushing an image reference and all its component parts to a registry is part of that standard.
When we talk about run, we really have to talk about two different aspects. When we talk about Kubernetes, the Kubernetes API is clearly defined and well adopted by many tools and organizations. When we step down to that runtime layer, as we've noted, only the formats are standardized there, the OCI runtime spec and image spec. We've already noted the CRI is the common factor among major runtimes built around those underlying OCI standard types. That does give us commonality in the Kubernetes space, but not necessarily at the runtime layer itself.
Build
Even though I just said that using a traditional Dockerfile is not the only way to generate a container image, this use of base images and Dockerfiles, and the workflow around that, remains a significant part of how people build images today. This is encoded into tools like docker build; BuildKit, which is effectively replacing docker build with its own implementation but is also used by many other tools; Buildah from Red Hat; and many others, which continue to provide and enhance this workflow of Dockerfiles, base images, and adding content. The API in this model is really that Dockerfile syntax. BuildKit has actually been providing revisions of the Dockerfile syntax, in effect its own standard, and adding new features. There are interesting new innovations that have been announced even in the past few weeks.
If you're looking for tools that combine these build workflows with Kubernetes deployments and development models, there are definitely more than the few in this list. You can look at Skaffold, or Tekton, or Kaniko, and again, many other vendor tools that integrate ideas like GitOps and CI/CD with these traditional build operations of getting your container images assembled. There are a few interesting projects out there that may be worth looking at. One is ko: if you're writing in Go, maybe writing microservices where you just want static Go binaries on a very slim base, ko can do that for you, even build multi-arch images, and it integrates push and integrates with many other tools.
Buildpacks, which has been contributed to the CNCF, coming out of some of the original work in Cloud Foundry, brings interesting ideas about replacing those base layers without having to rebuild the whole image. BuildKit has been adding some interesting innovations; there's actually a recent blog post about a very similar idea using Dockerfile. Then there's dagger.io, a new project from Solomon Hykes and some of the early founders of Docker, which is looking at providing some new ideas around CI/CD, again integrating with Kubernetes and other container services, providing a pipeline for build, CI/CD, and update of images.
Ship
For ship, there's already a common registry distribution API and a common format, the OCI image spec. Many build tools handle the ship step already by default. They can ship images to any OCI compliant registry. All the build tools we just talked about support pushing up to cloud services like ECR or GCR, an on-prem registry or self-hosted registry. The innovations here will most likely come via artifact support. One of the hottest topics in this space is image signing. You've probably heard of projects like cosign and sigstore, and the Notary v2 efforts.
There's a lot of talk about secure supply chain, and so a software bill of materials is another artifact type that aligns with your container image. Then there are ideas about bundling: it's not just my image, but Helm charts or other artifacts that might go along with my image. These topics are being collaborated on in various OCI and CNCF working groups. Hopefully, this will lead to common APIs and formats, and not a unique set of tools that all operate slightly differently. Again, ship maybe has our clearest sense of common APIs and common formats, and that continues even with some of the innovations around artifacts and signing.
Run - User/Consumer
For the run phase, we're going to split our discussion along two axes, one as a user or a consumer, and the other as a builder or a vendor. On the user side, your main choice is going to be Kubernetes, or something else. With Kubernetes, you'll have options for additional abstractions: not just whether you depend on a managed service from a cloud provider or roll your own, but even higher-layer abstractions around PaaSes like Knative, or OpenFaaS, or Cloud Foundry, which is also built around Kubernetes.
No matter your choice here, the APIs will be common across these tools, and there'll be a breadth of integrations that you can pick from because of the size and scale of the CNCF and Kubernetes ecosystem. Maybe Kubernetes won't be the choice based on your specific needs. You may choose some non-Kubernetes orchestration model, maybe one of the major cloud providers' offerings, Fargate, or Cloud Run, or maybe cycle.io, or HashiCorp's Nomad. Again, these are ideas that aren't built around Kubernetes, but provide some of those same capabilities. In these cases, obviously, you'll be adopting the API and the tools and the structure of that particular orchestration platform.
Run - Builder/Vendor
As a builder or vendor, again, maybe you'll have the option to stay within the Kubernetes or CNCF ecosystem. You'll be building or extending or integrating with the Kubernetes API and its control plane, again, giving you a common API entry point. The broad adoption means you'll have lots of building blocks and other integrations to work with. If you need to integrate with container runtimes, we've already talked about the easy path within the Kubernetes context of just using the CRI.
The CRI has already abstracted you away from having to know details about the particular runtime providing the CRI. If you need to integrate at a lower point, for more than one runtime, we've already talked about there not being any clean option for that. Maybe there's potential for you to integrate at the lowest layer of the stack, with runC or by using OCI hooks. There are drawbacks there as well, because maybe there'll be integration with microVMs like Kata Containers or Firecracker, which may prevent you from having the integration you need at that layer.
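For a sense of what that lowest-layer option looks like, here is a rough sketch of declaring OCI hooks with the runtime-spec Go types, assuming a recent version of the github.com/opencontainers/runtime-spec/specs-go module; the hook binaries and arguments are purely hypothetical.

```go
// Minimal sketch: OCI hooks are just executables the runtime invokes at
// defined lifecycle points, declared in the runtime-spec config.
package main

import (
	"encoding/json"
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

func main() {
	hooks := specs.Hooks{
		// Runs in the runtime namespace after the container is created but
		// before the user process starts (a common spot for agents to attach).
		CreateRuntime: []specs.Hook{{
			Path: "/usr/local/bin/register-container", // hypothetical agent binary
			Args: []string{"register-container", "--notify"},
		}},
		// Runs after the container process has exited; useful for cleanup.
		Poststop: []specs.Hook{{
			Path: "/usr/local/bin/deregister-container", // hypothetical cleanup binary
		}},
	}

	out, _ := json.MarshalIndent(hooks, "", "  ")
	fmt.Println(string(out)) // this block would sit under "hooks" in config.json
}
```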
Decision Points
Hopefully, you've seen some of the tradeoffs and pros and cons of decisions you'll need to make either as someone building tools for the space or needing to adopt a platform, or trying to understand how to navigate the container space.
Here's a summary of a few decision points. First of all, the Docker engine and its API are still a valid single-node solution for developers. There are plenty of tools and integrations, and it's been around for quite a while. We haven't even talked about Docker Compose, which is still very popular and has plenty of tools built around it, so much so that Podman, from Red Hat, has also implemented the Docker API and added compose support. Alternatively, containerd, which was really created as an engine to be embedded, without a full client, now has a client project called nerdctl that has also been adding compose support and providing some of the same client experience without the full Docker engine.
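As a rough sketch of that embedded-engine model, here is what the containerd Go client looks like, the same library that nerdctl builds on, assuming the containerd 1.x Go module and the default socket path.

```go
// Minimal sketch: use containerd's Go client to pull an image and create a
// container, with no Docker engine involved.
package main

import (
	"context"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
	"github.com/containerd/containerd/oci"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// containerd scopes everything by namespace; Docker and nerdctl use their own.
	ctx := namespaces.WithNamespace(context.Background(), "default")

	// Pull from an OCI registry and unpack the layers into a snapshot.
	image, err := client.Pull(ctx, "docker.io/library/alpine:latest", containerd.WithPullUnpack)
	if err != nil {
		panic(err)
	}

	// Create container metadata plus an OCI runtime spec derived from the image config.
	container, err := client.NewContainer(ctx, "demo",
		containerd.WithNewSnapshot("demo-snapshot", image),
		containerd.WithNewSpec(oci.WithImageConfig(image)),
	)
	if err != nil {
		panic(err)
	}
	// A real client would now create and start a task; here we just clean up.
	defer container.Delete(ctx, containerd.WithSnapshotCleanup)
}
```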
Of course, we've already seen that Kubernetes really provides the most adopted platform in this space, both for tools and for having a common API. This allows for broad standardization: tools, interoperability, and use in both development and production. There's a ton going on in this space, and I assume, and believe, that will continue. It's also worth noting that even though we've shown that there's no real common API outside of the Kubernetes ecosystem for containers, most likely, as you know, you're going to adopt other APIs adjacent even to your Kubernetes use, or to the container tools that you might adopt. You're probably going to choose a cloud provider, an infrastructure platform. You're going to use other services around storage and networking. There will always be a small handful of APIs, even if we could come into a perfect world where we defined a clear and common API for containers.
The API Future
What about the future? I think it's pretty easy to say that significant innovation around runtimes and the APIs around them will stay in Kubernetes because of the breadth of adoption and the commonality provided there. For example, SIG Node, the special interest group in Kubernetes focused on the node, which includes the kubelet software and its components, and the OCI communities are really providing innovations that cross up through the stack to enhance capabilities. For example, there are Kubernetes Enhancement Proposals still in flight for user namespaces, checkpoint/restore, and swap support.
As these features are added, they drive this commonality up through being exposed in the CRI, and are also implemented by the teams managing the runtimes themselves. You get to adopt new container capabilities all through the common CRI API, and the runtimes and the OCI communities that deal with the specifications do the work to make it possible to have a single interface to these new capabilities.
There will probably never be a clear path to commonality at the runtimes themselves. Effectively, at this moment, you have two main camps. You've got Docker, dependent on containerd and runC, and you have CRI-O, and Podman, and Buildah, and crun, and some other tools used in OpenShift and by Red Hat customers via RHEL and other OS distros. There are different design ideologies between these two camps, and it really means it's unlikely that there will be an absolutely common API for runtimes outside of that layer above, the Container Runtime Interface in Kubernetes.
Q&A
Wes Reisz [Track Host]: With the nerdctl containerd approach, does it use the same build API as the Dockerfile syntax?
Estes: Yes. Very similar to how Docker has been moving to using BuildKit as the build engine when you install Docker, which is available today using the Docker Buildx extensions, nerdctl adopts the exact same capability. It's using BuildKit under the covers to handle building containers, which means it definitely supports Dockerfiles directly.
Reisz: You said there towards the end that there's no clear path for commonality at the runtime level, almost CRI-O, Podman, Buildah versus Docker, containerd. Where do you see that going? Do you see that always being the case? Do you think there's going to be unification?
Estes: I think because of the abstraction where a lot of people aren't building around the runtime directly today, if you adopt OpenShift, you're going to use CRI-O, but was that a direct decision? No, it's probably because you like OpenShift the platform and some of those platform capabilities. Similarly, containerd is going to be used by a lot of managed services in the cloud, already is.
Because of those layers of platform abstraction, again, my personal feeling is there's not a ton of focus on, do I have to make a big choice between CRI-O, or do I use Podman for my development environment, or should I try out nerdctl? Definitely in the developer tooling space, there's still potentially some churn there. I try and stay out of the fray, but you can watch on Twitter; there are the Podman adherents promoting Podman's new release in RHEL. It's not necessarily at the level of the container wars we saw with Docker and Docker Swarm and Kubernetes.
I think it's more in the sense of the same kinds of things we see in the tooling space, where you're going to make some choices, and the fact that I think people can now depend on interoperability because of OCI. There's no critical sense in which we need to have commonality here at that base layer, because I build with BuildKit, I run on OpenShift, and it's fine. The image works. It didn't matter which build tool I used, or my GitHub Actions workflow spits out an OCI image and puts it in the GitHub Container Registry, and I can use that with Docker on Docker Desktop. I think the OCI has calmed any nervousness about it being a problem that there are different tools and different directions that the runtimes are going in.
Reisz: I meant to ask you about ko, because I wasn't familiar with it. I'm familiar with Cloud Native Buildpacks and the way that works. Is ko similar, just from a Go perspective? It just doesn't require a Dockerfile, and creates the OCI image from the Go code? What does that actually look like?
Estes: The focus was really that simplification: I'm in the Go world, I don't really want to think about base images, and whether I'm choosing Alpine or Ubuntu or Debian. I'm building Go binaries that are fully isolated; they're going to be static, they don't need to link to other libraries. It's a streamlined tool when you're in that world. They've made some nice connection points where it's not just building; I can integrate this as a nice one-line ko build and push to Docker Hub. You get this nice, clean, very simple tool if you're in that Go microservice world. Because Go is easy to cross-compile, you can, say, throw an AMD64, and an Arm, and a PowerPC 64 image all together in a multi-arch image named such and such. It's really focused on that Go microservice world.
Reisz: Have you been surprised or do you have an opinion on how people are using, some might say misusing, but using OCI images to do different things in the ecosystem?
Estes: Daniel and a few co-conspirators have done hilarious things with OCI images. At KubeCon LA last fall, they wrote a chat application that was using layers of OCI images to store the chat messages. By taking something to the extreme, showing an OCI image is just a bundle of content, and I could use it for whatever I want.
I think the artifact work in OCI is worth looking at; if people haven't read about that, search on artifact working group or OCI artifacts, and you'll find a bunch of references. The fact is, it makes sense that there is a set of things that an image is related to. If you're thinking object-oriented, you know this object is related to that one. A signature is a component of an image; an SBOM, a software bill of materials, is a component of an image. It makes sense for us to start to find ways to standardize this idea of what refers to an image.
There's a new part of the distribution spec being worked on called the Refers API. You can ask a registry, I'm pulling this image, what things refer to it? The registry will hand back, here's a signature, or here's an SBOM, or here's how you can go find the source tarball if it's open source software under the GPL. I'm definitely on board with expanding the OCI, not the image model, but the artifact model that goes alongside images, to say, yes, the registry has the capability to store other blobs of information, and they make sense because they are actually related to the image itself. There's good work going on there.
Reisz: What's next for the OCI? You mentioned innovating up the stack. I'm curious, what's the threads look like? What's the conversation look like? What are you thinking about the OCI?
Estes: I think a major piece of that is the work I was just talking about. The artifact and Refers API are the next pieces that we're trying to standardize. The container runtime spec and the image spec, as you'd expect, are things that people have built whole systems on, and they're no longer fast-moving pieces. You can think of small tweaks, making sure we have in the standards all the right media types that reference new work, like encrypted layers or new compression formats. These are not the most exciting things ever, but they're little incremental steps to make sure the specs keep up with where the industry is. The artifacts and Refers API are the big exciting things because they relate to hot topics like secure supply chain and image signing.
Some of the artifact work relates to how people are going to build tools, and that's already happening. You have security vendors building tools. You have Docker, which released a new beta of their SBOM generator tool. The OCI piece of that will be, ok, here's the standard way that you're going to put an SBOM in a registry, and here's how registries will hand that back to you when you ask for an image's SBOM. The OCI's piece will again be standardizing and making sure that whichever tools you use from the handful of security vendors and tools out there, they'll hopefully all use a standard way to associate that with an image.