Vincent Batts, senior software engineer at Red Hat, talked about Linux containers and Docker in the virtualization developer room at FOSDEM, covering the pros and cons of the different storage drivers, the image format and the signing of images.
Containers are based on technologies that have been around for a long time, using Linux kernel features:
- Resource control groups (cgroups), which limit, account for and isolate resource usage such as memory, CPU and disk I/O.
- Namespace isolation, where groups of processes are separated so they cannot view or access resources in other groups.
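Both features are visible from user space through the `/proc` filesystem. The following is a minimal Python sketch, assuming a Linux host, that lists the cgroups and namespaces a process belongs to. The helper names (`parse_cgroup_line`, `current_cgroups`, `current_namespaces`) are illustrative and were not part of the talk.

```python
import os
from typing import Dict, List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse one line of /proc/<pid>/cgroup: 'hierarchy-ID:controller-list:path'."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # The controller list is empty for the unified (cgroup v2) hierarchy.
    names = [c for c in controllers.split(",") if c]
    return CgroupEntry(int(hierarchy_id), names, path)


def current_cgroups(pid: str = "self") -> List[CgroupEntry]:
    """Return the cgroup membership of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as fh:
        return [parse_cgroup_line(line) for line in fh if line.strip()]


def current_namespaces(pid: str = "self") -> Dict[str, str]:
    """Map namespace type to its identifier, e.g. 'pid' -> 'pid:[...]' (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


if __name__ == "__main__" and os.path.exists("/proc/self/cgroup"):
    for entry in current_cgroups():
        print(entry)
    for name, ident in current_namespaces().items():
        print(name, ident)
```

Two processes that report the same namespace identifier share that namespace. Running the script inside and outside a container should show different identifiers for the namespaces the container runtime has isolated.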
Docker popularized the concept of an ephemeral container: the container is launched from a clean state to reach an expected one, and prior snapshots can be managed with commands such as inspect, restart or commit.
The underlying storage drivers in Docker are swappable and provide different features, performance and stability:
- Vfs: a no-frills, no-magic storage driver, and one of the few that can run Docker in Docker.
- Aufs: a fast, memory-hungry driver that is not upstream and is only present in the Ubuntu kernel. If the system has the aufs utilities installed, Docker will use it. It eats a lot of memory when there are many container start/stop events, and has issues in some edge cases, which may be difficult to debug.
- Devicemapper: the default driver, generally supported, and getting better with lots of contributions. It uses thin provisioning and "copy on write" snapshots. It is the slowest driver, but can be tuned to improve its performance in production. It needs two block devices, for metadata and data, and should not be used for high I/O requirements.
- Btrfs: a fast, experimental driver, which still has some bugs and could cause data corruption.
- OverlayFS: a new driver using a union filesystem, included upstream, and very fast. It was finally merged in the 3.18.0 kernel and is heavily iterated on.
Vincent's advice on choosing a storage driver:
There is no clear winner. The defaults are generic; learn which one works best for your use case.
Container layers are a concept introduced by Docker that did not exist in LXC. They are immutable and read-only, with a read-write snapshot on top. Diffs are a big performance area because the storage driver needs to calculate the differences between layers, and how it does so is particular to each driver. Btrfs is fast because it does some of the diff operations natively.
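To make the idea of a layer diff concrete, here is a toy Python model. It is not how any Docker storage driver is implemented. It assumes each layer can be summarized as a mapping from file path to a content fingerprint, and the function name `diff_layers` is hypothetical.

```python
from typing import Dict, List


def diff_layers(lower: Dict[str, str], upper: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two snapshots given as {path: content fingerprint}.

    `lower` stands for the read-only parent layer and `upper` for the
    state after changes were made in the read-write layer on top.
    """
    added = sorted(p for p in upper if p not in lower)
    deleted = sorted(p for p in lower if p not in upper)
    modified = sorted(p for p in upper if p in lower and upper[p] != lower[p])
    return {"added": added, "modified": modified, "deleted": deleted}


if __name__ == "__main__":
    base = {"/bin/sh": "b2", "/etc/hostname": "a1", "/etc/motd": "e5"}
    container = {"/bin/sh": "b2", "/etc/hostname": "c3", "/tmp/log": "d4"}
    print(diff_layers(base, container))
```

A driver without native snapshot support has to do something comparable to this walk over both trees, which is one reason diffing cost varies so much between drivers.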
The Docker portable image format is composed of tar archives that are largely for transit:
- Committing a container to an image with `commit`.
- Docker `push` and `save`.
- Docker `build` to add context to an existing image.
When creating an image, Docker will diff each layer and create a tar archive of just the differences. When pulling, it will expand the tar in the filesystem. If you pull and push again, the tarball will change because it went through a mutation process: permissions, file attributes or timestamps may have changed.
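The effect of that mutation can be reproduced with Python's standard `tarfile` module. The sketch below is illustrative rather than a model of Docker's code: it packs the same file contents twice and shows that a change in metadata alone (here the modification time or the mode) yields a different archive digest. The function name `layer_tar_digest` is hypothetical.

```python
import hashlib
import io
import tarfile
from typing import Dict


def layer_tar_digest(files: Dict[str, bytes], mtime: int, mode: int = 0o644) -> str:
    """Pack `files` into an in-memory tar archive and return its SHA-256 digest."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):  # fixed order keeps the archive reproducible
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            info.mode = mode
            tar.addfile(info, io.BytesIO(data))
    return hashlib.sha256(buf.getvalue()).hexdigest()


if __name__ == "__main__":
    files = {"etc/hostname": b"web01\n"}
    original = layer_tar_digest(files, mtime=1000)
    repacked = layer_tar_digest(files, mtime=2000)
    print(original == repacked)  # the contents match, the archives do not
```

Any checksum or signature computed over the first archive would therefore not match the second, even though the files inside are byte-for-byte identical.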
Signing images is very challenging because, despite images being mounted read-only, the image layer is reassembled every time. It can be done externally, using `docker save` to create a tarball and `gpg` to sign the archive.
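The external approach can be scripted. The Python sketch below assumes that the `docker` and `gpg` binaries are on the PATH and that a default GPG signing key is configured. The image name, file names and helper functions are placeholders, not something shown in the talk.

```python
import subprocess
from typing import List


def save_command(image: str, tarball: str) -> List[str]:
    """Build the `docker save` invocation that exports an image to a tarball."""
    return ["docker", "save", "--output", tarball, image]


def sign_command(tarball: str, signature: str) -> List[str]:
    """Build the `gpg` invocation that writes an ASCII-armored detached signature."""
    return ["gpg", "--armor", "--output", signature, "--detach-sign", tarball]


def verify_command(tarball: str, signature: str) -> List[str]:
    """Build the `gpg` invocation that checks the tarball against its signature."""
    return ["gpg", "--verify", signature, tarball]


def save_and_sign(image: str, tarball: str) -> str:
    """Export the image, sign the archive, and return the signature path."""
    signature = tarball + ".asc"
    subprocess.run(save_command(image, tarball), check=True)
    subprocess.run(sign_command(tarball, signature), check=True)
    return signature


if __name__ == "__main__":
    # Placeholder image and file names; requires a running Docker daemon.
    print(save_and_sign("example/image:latest", "image.tar"))
```

The signature covers the exported archive only. A recipient would verify it with the command built by `verify_command` before loading the tarball, since the signature says nothing about layers that have since been pulled and reassembled.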