Reproducible Development with Containers


Summary

Avdi Grimm describes the future of development, which is already here. Get a tour of a devcontainer, and contrast it with a deployment container.

Bio

Avdi Grimm is a consulting pair-programmer, the author of several popular Ruby programming books, and a recipient of the Ruby Hero award for service to the Ruby community. He helps developers dance with code at Graceful.Dev.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Grimm: The other day, I took my car to the car wash. It's one of those bougie car washes where you hand your car over to the attendants, and then you go and wait while they clean it inside and out. I had some time to kill. Before I went to get my car washed, I had been right in the middle of some coding work. All I had there was a Wi-Fi connection and the tiny Chromebook that I'd thrown in my bag. I brought up my project on GitHub, I opened it up in GitHub Codespaces. I started right where I left off with my full development environment running in the cloud. Not just the editor either, a whole virtual machine customized for my project. This talk is not about Codespaces. It's not about cloud based development. It is about spinning up a whole development environment from zero in just a few seconds or minutes.

Outline

This talk is about upping your expectations for developer experience. It's about taking initial project setup from ordeal to non-event. It's about packaging your development environment right along with your code. It's about leveling the playing field on your team, such that everyone benefits from those teammates who like to tweak their environment for maximum efficiency. It's about eliminating unreproducible problems where the database won't start on one developer's machine, but none of the old hands can help them because it works on their machines. It's about anyone being able to contribute a few lines of code when they're bored at the car wash. It's about a future of developer experience that you can start using right now. It's about containers for development, otherwise known as devcontainers.

What Is a Devcontainer?

When I say container, I'm talking about the container you typically run with Docker. This talk is not an introduction to Docker. I'm going to assume that you have some basic familiarity with containerization. The implication here is also that your project runs in Linux, which is true of most web application development these days. If you're targeting iOS, or the Windows desktop, or some other non-Unix like platform, what follows may be less applicable to your project. Even though this is not an introduction to Docker, it is worth very briefly talking about what makes containers so much better suited to developing inside them than some of the older virtualization technologies like Parallels, or VirtualBox, or Vagrant. In a nutshell, it's because containers aren't virtualization at all.

Yes, containers give us something that looks like a tiny computer inside another computer. Rather than trying to simulate a computer, a container works by creating an isolated set of namespaces, including the file system namespace, network ports, the process table, and all the other namespaces that go into a running operating system. This means that unlike virtualization, containers have the potential to run project code and tools at native speeds without bringing a development machine to its knees. Because the host operating system can map files into the container namespace, we can edit source code using native tools while running the code inside the container. Also, unlike virtualization technology, devcontainers aren't opaque binary images to be passed around. We determine the Linux version, system packages and libraries, utilities, file system mappings, open ports, supporting services, all the stuff that will go into a devcontainer, and we reflect that using human-readable configuration files that get versioned right alongside the project source code.

In effect, a devcontainer is a fully functional, batteries included development environment that is shared, versioned, reproducible, self-documenting and always up to date, so long as it's in use. A devcontainer is like the ramen noodles of development environments, just add hot water and you're ready to go. This talk is also not going to be a tutorial. Building out a full devcontainer is an ongoing iterative process. One that's very specific to your project. Instead, I'm going to give you a tour of what a devcontainer can look like, and what it can feel like to use one to be part of a team with a reproducible development environment.

Getting Started With the Foobar Project

Recently, I joined a client for a six-month engagement. Like most teams with a large, old project, they had a lengthy set of initial setup instructions and scripts scattered across their README and Wiki pages. As always, parts of the directions were outdated or contradictory. The setup scripts had very specific implicit expectations that they'd be running on a brand new MacBook with a particular version of macOS, and that this laptop would be dedicated to developing on this project. It was fine to make global configuration changes to it. This confusing, lengthy and constantly outdated onboarding process is the norm on most project teams I've seen. If, during your first full week of onboarding, you can get part of the project's test suite running, you're doing pretty well.

Devcontainer Configuration Files

The first thing that I did when I joined this team was to create a devcontainer configuration that turned all of this documentation into executable configuration. To create a devcontainer, we make a set of configuration files for Docker, and we keep them separate from any Docker configuration files that already exist for a deployment container. We usually put them in a .devcontainer directory in the project repository. At the very least, this includes a Docker Compose configuration file, which defines what container or containers to start up, and how to connect them to each other and to the host computer. Typically, there's also a Dockerfile to customize the app development container. Devcontainers tend to accumulate some script files as well. These are hooked into various points in the container lifecycle. If we're using VS Code as our editor, there's also a devcontainer.json file.
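As an illustration that was not shown in the talk, here is a minimal sketch of what a devcontainer.json for a Compose-based setup might contain. It is written as a Python dictionary only so that it can be run and checked. The project name, the Compose service name "app", the folder "/workspace", and the extension ID are placeholder assumptions; the property names are the ones VS Code reads from .devcontainer/devcontainer.json.

```python
import json

# Hypothetical values: the project name, the service name "app", the folder
# "/workspace", and the extension ID are placeholders, not taken from the talk.
devcontainer = {
    "name": "foobar-dev",
    # Compose file that defines the app container and its supporting services.
    "dockerComposeFile": "docker-compose.yml",
    # Which Compose service the editor should attach to.
    "service": "app",
    # Where the project source is mounted inside that container.
    "workspaceFolder": "/workspace",
    "customizations": {
        "vscode": {
            # Extensions installed inside the container for everyone on the team.
            "extensions": ["dbaeumer.vscode-eslint"],
        }
    },
}


def render_devcontainer_json(config: dict) -> str:
    """Serialize the configuration as it could appear in devcontainer.json."""
    return json.dumps(config, indent=2)


print(render_devcontainer_json(devcontainer))
```

The printed JSON could be saved next to the Compose file and Dockerfile, so that the editor integration is versioned together with the rest of the container definition.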

I put this together, and I committed it all to the repository. Some of the folks on the team were like, we're never going to use it, but you do you. Then, a funny thing happened. While I was there, several of the new hires used the devcontainer to get up and developing more or less instantly. Someone else from another team used the devcontainer to make PRs on a code base they didn't usually work on without having to spend a week getting it set up. By the time I moved on, the devcontainer had become one of my most lasting and appreciated contributions.

Turn Onboarding into a Non-Event

Turning onboarding into a non-event is one of the most obvious and immediate benefits of developing in containers. It's not just for new hires. It could mean someone from your frontend team is able to jump in and make tweaks to your backend code. It could mean you, three years from now, being able to quickly come back and fix a bug. Project setup checklists and scripts quickly go stale, because once we have the project configured on a machine, we don't think about them again. Devcontainers are regularly rebuilt whenever anybody tweaks them. A devcontainer is executable documentation of what libraries, services, system configurations, open ports, and nice-to-have utilities go into day-to-day development with your project. For instance, does your team sometimes use ngrok to expose a local development machine to a remote user? Don't write the setup instructions in the wiki. Add it to the devcontainer, and then everyone who uses the devcontainer will have the right tools when they need them.

I've also seen many projects where you're considered to be doing well if you can run the unit tests locally, but only the CI system has all the right magical invocations and extra supporting services to run the system or the integration tests. In the extreme case, only a select few infrastructure gnomes know how to fix the system tests when they don't work, which can leave developers twiddling their thumbs when their changes break the build. With a devcontainer, one that everyone shares, and that's also used in CI, we can switch our expectations to "everyone can run all the tests all the time." They might still run faster in parallel on CI, but keeping that integration test passing becomes everyone's business.

Devcontainers can support the full testing cycle because they are able to package up not just a tiny computer for developing on the app itself, but also the constellation of supporting services needed to run the app. Does the app need a Redis server and a particular version of PostgreSQL with specific extensions installed? A Docker Compose configuration can ensure that these are spun up, available, and connected when the devcontainer has started. The devcontainer can even include wizardly tweaks from the one Postgres expert on the team to optimize the development database server for responsiveness over reliability. Speaking of local database servers, have you ever had to install a specific system library or version of PostgreSQL to satisfy one app, only to have that break another app you're working on? With devcontainers, you can switch between multiple projects on one machine. This is essential if you're a consultant, but it's applicable to any organization that has more than one code base.
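One way a devcontainer lifecycle script might confirm that the supporting services are "spun up, available, and connected" is to poll their ports before running the tests. The following is a sketch under stated assumptions, not something shown in the talk: the service names "postgres" and "redis" are hypothetical Compose service names, and the ports are the default ports of those servers.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not accepting connections yet (or name not resolvable yet); retry.
            time.sleep(0.2)
    return False


# Hypothetical Compose service names mapped to each server's default port.
SERVICES = {"postgres": 5432, "redis": 6379}


def check_services(services: dict, timeout: float = 30.0) -> dict:
    """Map each service name to whether its port became reachable in time."""
    return {name: wait_for_port(name, port, timeout) for name, port in services.items()}
```

Inside the app container, calling `check_services(SERVICES)` from a post-start script would report which services are reachable. An open port only shows that the server accepts connections, not that it is fully initialized, so a real script might follow up with a trivial query.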

Language Runtime Version Managers

While we're talking about switching between projects, if you're used to working with languages like Python, or Ruby, or JavaScript, you're used to having to deal with version managers like VirtualEnv, or RVM, or NVM. These tools build and install and manage multiple versions of Python or Ruby or Node side by side to ensure that each project uses just the right version of the language runtime. These tools add an extra level of indirection. They are a hassle at the best of times. When you're dealing with a language ecosystem you're unfamiliar with, they can be an added obstacle to becoming productive. Using devcontainers eliminates this entire class of utility. Since a devcontainer is dedicated to a single project, it can have the correct version of Ruby or Python or JavaScript installed globally. If this means compiling the runtime from source, that can be rolled into the devcontainer's Dockerfile. I haven't had to touch a language version manager since I started using devcontainers for everything. I do not miss it.

Over time, a lot of projects evolve a standard set of Shell aliases and Git aliases that shorten common actions. Some of them are basic ones that are applicable to any project. There are often a few shortcuts that are very specific to how one team works with their app. Usually, these just get spread by someone on the team evangelizing them until others slowly adopt them. It can be jarring to be pairing on some code and then realize that the shortcuts you're used to using aren't there. The presence or the absence of these shortcuts, in my experience can also lead to a subtle social partitioning of the team into the cool kids who always have the best shell aliases, and the uncool kids who lag behind. What if anyone on the team could instantly add a useful shell alias for everyone else? That's exactly what you can do when you're all using a devcontainer. Instead of posting the shell alias in Slack, you can make a PR that adds it for everyone, and then show off how to use it in Slack. Since the devcontainer contains a common shared Unix user space, you can be sure that those shortcuts will work for everyone.

One of the biggest benefits of devcontainers shows up once most people on the team are using it. Have you ever had one developer on your team suddenly start to have an issue that no one else sees? Eventually, it turns out that they received some system update that was incompatible with one of the libraries the project depends on, but no one knew how to help them because it worked on their machine. Consistently using devcontainers can drastically cut down on the 'works on my machine' phenomenon. Nothing can make individual developer environments perfectly identical, but having a common container definition can eliminate a huge number of potential variables. Once you nail down whatever library update broke the project, you can easily fix it for everyone. Because with a container, you can get as specific and locked down as you need to get with utilities and system library versions. It helps with those 'works on my machine' problems.

Cloud Based Development Environments

Once you've got a devcontainer definition, you're not limited to my machine at all. Cloud based development using tools like Amazon Cloud9, or GitHub Codespaces is a thing now, and it's only going to become more of a thing. Cloud based development environments enable remote pair programming. They give you the ability to drop in and write some code wherever you have a browser, even if you accidentally left your laptop bag in the train station in Amsterdam. If you have a devcontainer definition that works locally, you can fire up an IDE in the cloud. Devcontainers are very applicable to open source development as well. Have you ever wanted to contribute a small change to an open source project, but when you pulled the code down, you realized it was a long and involved process to get the unit tests running? You gave up and you dropped a suggestion in the bug tracker. What if open source projects came with devcontainers that could make casual contributors immediately productive? It might feel a lot more inviting.

Editors and IDEs

Let's talk about editors and IDEs. IDEs are starting to add features to embrace container based development. VS Code is definitely at the forefront of this trend. In fact, a lot of my thinking about devcontainers, including the term devcontainer has been inspired by the way VS Code incorporates container support. As long as you have the VS Code remoting extensions installed, in any project, you can ask VS Code to open the project in a devcontainer. If there isn't already a devcontainer configuration, you can choose from a list of templates to get you started. If there is a devcontainer configuration, VS Code will fire up the appropriate containers and effectively restart itself inside the main project container. This integration, it's more than skin deep. VS Code actually arranges to run part of itself including a lot of its extensions inside the container. This eliminates a huge class of common problems with remote development, where extensions trigger the wrong version of an executable or get confused by path translations. When we start up a shell or do anything else that would run a system command, that command is executed in the devcontainer. This tight integration of a devcontainer with a devcontainer-aware IDE can help make developers more effective as soon as they start working on a project.

For instance, it's common these days for a project to have linting or formatting rules that are customized for that code base. Traditionally, developers new to the project would have to install the linting tool and make sure their editor was correctly configured to use it. With a container-aware editor configuration, linting and code formatting can be working out of the box as soon as the developer fires up the project for the first time. This is not to say that using a devcontainer locks everyone down to using the exact same configuration, far from it. For instance, when you're using VS Code, the devcontainer can include a base level of project-specific settings and plugins. You can also layer your own settings, your own plugins, your own color schemes and key bindings on top of that. For that matter, there is no rule that your team has to settle on a single editor either. A project can incorporate devcontainer-aware configuration for multiple IDEs. You could include a full Vim setup right in the devcontainer, including the editor itself. VS Code might be the leader here, but I've noticed that the JetBrains IDEs have also been adding features for developing inside containers. Hopefully, this trend will continue.

Why a Separate Development Container?

No tool or technique is a panacea. I want to talk about some cases where you might not want to use devcontainers. Before I get there, I do want to talk about one of the most common tripping points I've seen in rolling out devcontainers for a project. It goes like this: "We already have a container definition. Can we reuse it?" Or, on the flip side of that, we hear: "This devcontainer stuff doesn't apply to us because we're not using containers to deploy." I think both of these points of resistance stem from the same false premise, the idea that containers are always for deployment. This is an understandable belief. If you're using containers at all in your project, it's probably because that's how you're deploying your application. You might also be using containers in your continuous integration infrastructure. Isn't that what containers are for?

It's true that deployment is the use case that has popularized containers. Devcontainers are useful whether you're deploying containers or not. In fact, thinking of devcontainers as fancy deployment boxes is a good way to miss out on most of their power. Because here's the thing, containers for deployment have very different needs than containers for development. In fact, a lot of the pressures on deployment containers are almost diametrically opposed to the pressures on devcontainers. We want deployment containers to be as small and stripped down as possible. We want them to be lean, fast, and security hardened. That means minimize non-essential libraries and tools. It may mean using a base image such as Debian-slim or even Alpine Linux, which lacks the usual glibc libraries that are found in an ordinary Linux distribution. For a devcontainer, we're trying to provide a full and comfortable development environment. That means a batteries included Linux distribution, command line tools, compilers, man-pages, the whole kit and caboodle.

For deployment, you want to minimize the security cross section. For development, you want to maximize things like ports that are open for debugging. For deployment, it's an error not to be talking to your observability services, like Honeycomb or New Relic. In development, you don't want to be sending messages to those services, and you may want to fake out or stub out some other external services. In deployment, you want to optimize your Docker builds for the smallest number of layers. While in development, you may want to optimize for quickly adding incremental changes that don't require a full image rebuild. In these and other ways, the goals of deployment containers and devcontainers are opposed to each other. That's why when I start in a new client and begin building a devcontainer, I normally start from scratch. I build a brand new set of container configuration files, working from the project setup instructions, rather than from any existing Dockerfiles. This gives me a portable, reproducible environment that's built for development, not for deployment. This doesn't mean that your devcontainer and deployment container configurations can't share some parts in common. You'll probably find that it's a lot easier to start with your devcontainer and strip it down to a deployment container than it is to start with a deployment container and build it up into a comfortable development environment.

Counter-indications

With all this said, devcontainers aren't right for every project. Everything we've talked about so far is predicated on running containers in Docker, which is a Linux based technology. Most web and enterprise applications are deployed to Linux based servers these days, so developing in a container means developing in something close to the delivery environment. The same goes for Android development. If your deployment target is not a Linux or Linux-like system, you may not want to go down this road. If you're targeting iOS devices or Windows native, containerized development may not be the best investment for you right now.

Docker Desktop Host Platforms

Also, as of 2021, there's a clear hierarchy in desktop platforms for Docker-based development. Running Docker on a Linux-based machine is probably the best experience since Linux is the native host for containers. Perhaps surprisingly, the next best option these days is Windows. That's because with the advent of the Windows Subsystem for Linux version 2, or WSL 2, Windows now runs Linux natively in parallel with the Windows kernel. You can actually pick your Linux distro of choice right out of the Windows Store and start running Linux binaries straight from the Debian or Fedora repositories, without any recompilation or emulation. Docker Desktop on Windows uses WSL 2 for its backend, which means that Docker containers on Windows are effectively running in their native Linux habitat, with no virtualization performance penalty. In my usage, it's stable, and it runs Rails projects at native speeds.

macOS is built on BSD, not Linux, and it doesn't have a WSL 2 equivalent, which means that some level of virtualization is involved in order to make Docker work. I don't use Macs for development anymore. I've heard from friends that they experience some flakiness and performance issues with Docker, particularly around file I/O. What to do about this? Hopefully this will be the most quickly dated portion of this talk. With any luck, Apple and Docker will find a way to sort out these issues sooner rather than later. There are also some tricks you can do to optimize file I/O performance in Docker.

Summary

There are certain technologies that once they mature, change the development state of the art. Back when I started programming, version control still wasn't universally embraced. Some projects still relied on periodically zipping up copies of the code for history. Over the course of my career, version control became universal. More recently, continuous integration went from being a novel new idea to being an industry standard. Today, both distributed version control and continuous integration are table stakes. We can barely imagine a software project without them. In 2021, we're at the beginning of this inflection point for development in containers. In five years, we'll laugh about how we used to think it was normal to spend days getting our developer laptop set up before making our first commits to a project.

You don't have to wait that long. With a little effort, you can already have all the benefits of devcontainers for yourself and for your team. You can have a portable, reproducible development environment that follows you from machine to machine and even into the cloud. You can get new hires up and running in hours instead of days. You can make it easier to contribute to your open source projects. You can make sure that every test that runs in CI can also be run locally. You can share your specialized development configurations and scripts with your teammates with a push to GitHub. You can do all this by committing to making devcontainers a normal part of your project's development workflow. That's why I think you should drop everything and create a devcontainer definition for your current project. Not only that, you should work inside that container and improve it until it's comfortable, so comfortable it feels like home. Your collaborators will thank you and your future self will thank you.

Resources

Of course, this has just been a teaser. If you want to learn even more, check out the course I've been putting together all about devcontainers. It starts with Docker basics and walks you through building a devcontainer from scratch. You can find it on my site at graceful.dev/devcontainers.

Questions and Answers

Kerr: The last one is about Docker for Windows being a paid program when used in enterprise. Do you happen to know how much that costs?

Grimm: I would be curious. I'd seen that item and my initial perspective is, good for them.

Kerr: True. Of course, Docker needs to continue to be a company. Five dollars a month per dev. Why does that amount of money matter?

Grimm: How much is the dev worth per hour?

Kerr: We pay $1,000 a year or something for IDEs. A lot more if you're in the Microsoft space, I think.

Grimm: Good for them for supporting themselves.

Kerr: Crucially though, it's free for personal use, which means we can use it for open source, which means we can use it at home.

Grimm: Also, let's be clear here. Docker itself is a set of open source Linux technologies. It's basically a set of pieces of the Linux kernel. Docker Desktop offers a bunch of services to tie that together. There are ways right now of running containers on Linux without using Docker to run them. I'm sure that stuff could be translated to WSL 2 if this ever became an issue. It's not secret sauce. The actual containerization, none of that is secret sauce. What Docker Desktop adds is GUI stuff on top of that.

Kerr: To make it easy. Because Docker for desktop has made this a lot easier.

Grimm: We should distinguish between desktop and the actual Docker technologies.

Kerr: You mentioned that you think containerized development and presumably remote development. Phillipa talked about remote development at Netflix, it seems to be just a matter of time before they get the remote development experience smooth enough, or roll it out as the standard there. Definitely moving in that direction. With version control, it hasn't been a smooth ride. Some of us actually learned to deal with the Git command line. What are the bumps that we're still having in using devcontainers?

Grimm: What are the bumps? It's a separate set of skills. The first bump is just that you need to understand a Docker Compose file, and then a Dockerfile. That was one of the big bumps for me, was just comprehending that ecosystem.

Kerr: For me, when I first started using Docker, the Linux admin, suddenly I'm like an administrator of a tiny pretend machine on my computer. I got to learn things about like /proc.

Grimm: I benefited from years of doing that admin here and there anyway. There's fiddly things like, you need to expose your web app on all interfaces, not 127.0.0.1.

Kerr: What is a network interface?

Grimm: You have to expose it on 0.0.0.0.

Kerr: That was a thing that I now have some idea about.

Grimm: It's not free. I think most of what you pick up winds up being useful understandings about how your app is exposed to the world. It's not free.
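To make the 127.0.0.1 versus 0.0.0.0 point concrete, here is a small sketch, not taken from the talk, that uses only Python's standard socket module. A listener bound to the loopback address accepts connections only from inside the same container, while one bound to 0.0.0.0 listens on every interface, including the one that Docker's published ports forward to. The helper name `open_listener` is hypothetical.

```python
import socket


def open_listener(host: str, port: int = 0) -> socket.socket:
    """Create a TCP listening socket bound to host; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


# Loopback only: reachable from inside the container, but not via a published port.
loopback_only = open_listener("127.0.0.1")
# All interfaces: also reachable through the container's network interface.
all_interfaces = open_listener("0.0.0.0")

# Show the address each listener is actually bound to.
print(loopback_only.getsockname()[0])
print(all_interfaces.getsockname()[0])

loopback_only.close()
all_interfaces.close()
```

Most web frameworks expose the same choice as a host or bind option on their development server, so the fix is usually a one-line setting in the devcontainer's start command.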

Kerr: How do you handle secrets for your devcontainers, like accessing source control and Nexus Repos?

Grimm: The main thing is, usually I have a .env file, which is not versioned. There might be a .env.example that's versioned, but the actual .env file isn't versioned. Docker Compose has built-in support for .env files, so when it's bringing your container up, it'll load in your .env file as if its contents were environment variables set in your environment.

Kerr: You can store your secrets on a file on your local drive and Docker will mount them into the container?

Grimm: Yes, basically. Maybe won't mount them, but it'll make them part of the environment.

Kerr: Secrets, not trivial, but definitely possible and increasingly supported.
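The idea of a .env file being loaded "as if" its contents were environment variables can be modeled in a few lines. This is a simplified sketch, not Docker Compose's actual parser and not code from the talk: it handles only plain KEY=VALUE lines, and the variable names are hypothetical. It does mirror one documented Compose behavior, which is that variables already set in the shell take precedence over values in the file.

```python
import os


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def effective_environment(env_file_text: str, process_env: dict) -> dict:
    """Merge file values with the real environment; real variables win."""
    merged = parse_env_file(env_file_text)
    merged.update(process_env)
    return merged


# Hypothetical contents of an unversioned .env file.
example = """
# credentials for the private package repository
NEXUS_USER=dev
NEXUS_TOKEN="not-a-real-token"
"""

env = effective_environment(example, dict(os.environ))
print(sorted(key for key in env if key.startswith("NEXUS_")))
```

A versioned .env.example would carry the same keys with dummy values, so a new teammate knows which secrets to fill in without any real credentials being committed.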

They are having issues convincing their teams to support and improve their devcontainer definition by themselves. They would have hoped a central team would do that for them like an external tool. Any advice?

Grimm: It is daunting at first. It was daunting for me at first. A lot of the ways that Docker is taught right now or the information you find about it is so heavily slanted towards deployment, which is a different world, and you wind up spending a lot of time learning something that turns out not to be all that important for running it locally. Somebody on your team is going to be interested in Docker hopefully and can write that initial devcontainer and then write the README or add the stuff to the README that says, here's how you start it up.

Kerr: By all means, maybe your central team supplies a starting point, but you have to pulse up a paved path.

Grimm: Depending on the size of your enterprise.

Kerr: That's a great thing to have.

Grimm: It really depends on the size of your team.

Kerr: Hopefully somebody on your team will have an interest because this is like all of those technologies that we need. This puts a little bit more Ops in the DevOps on the development side.

Grimm: Which is a good thing.

Kerr: It's a little piece of reality that we start having to care about, but at least it's safe and local to be able to log into production containers.

Grimm: One of the things that I think is a benefit of this is, I really think it's useful, like if you are deploying using containers, and more of us are, I think it's becoming a core part of the stuff that you need to know, as a developer, is, how does this containerized deployment work?

Kerr: It used to be the hardcore devs who understood properties of Linux. Strangely, it's coming back. It's more important than it was 10 years ago, instead of being further abstracted away from us.

Grimm: It's the good parts of Linux. It's not the annoying parts, like how to make sound work.

Kerr: Windows with WSL 2, it's like Linux with sound.

Grimm: That's how you make sound work on Linux is you run it under Windows.

Kerr: Satish would like to get the sound to work. I just don't on Linux.

Grimm: Out of scope.

Kerr: Probably your applications don't depend on that.

Grimm: If they do, that is actually a thing that the WSL developers have been working on, is just making that work out of the box, which, yes, pretty cool.

Kerr: That has been a challenge for Linux on the desktop.

Is it possible to run an IDE such as Eclipse or applications that don't have container support?

Grimm: The JetBrains IDEs have been building in container support.

Kerr: You don't have to have container support. Your editor doesn't have to know that you have a container running.

Grimm: There's two or three levels of this. The base level is, when you're developing in a devcontainer, your files are still on your host machine, you're mapping them into the container, but your source files are on the host machine. Right from the start, you can continue to use any editor that you want. Having the editor start your application or start the debugger in the devcontainer, that's the integration. That's where you start needing the IDE or the editor to understand.

Kerr: In a sense [inaudible 00:33:46] debugging. You can set up your container to run the app with a debugging port exposed. You can expose that on the container, and you can tell your IDE to connect to that. You can debug from your IDE while it's running. You'd have to set it up.

Grimm: Even in an IDE that doesn't understand this whole container ecosystem, you can still expose the ports in any environment that I'm familiar with. Usually there's a way to expose the debugging port. JetBrains has also been adding support for like, tell it that you're running inside a particular Docker Compose configuration, and when you try to run a command, it'll try to do that inside the container rather than trying to do it on the host machine.

Kerr: Do you have an opinion on the file syncing between containers in Linux and the Windows file system?

Grimm: That's a pro tip: if you are doing this using Windows and WSL 2, your files need to be in the Linux file system that it exposes. You can do it with your files on the Windows side of the file system, but it's slow and it's flaky. Now that Windows has direct access to the Linux file system side, it's a no-brainer. You just throw your stuff in there, and everything works. You can still get to everything from your Windows apps. Also, it just runs at native speed inside all your containers.

Kerr: You mentioned that on the team that you built the devcontainer for, new people found it really valuable. What about the people who already had everything installed?

Grimm: There might have been one or two people who had stuff installed that started playing with the devcontainer. When somebody is comfortable in their environment, they've got that comfy couch that they've worn in, it's hard to move to a devcontainer.

Kerr: They've got their mom's Ramen recipe and that's the only way that mom has been doing.

Grimm: Yes. I totally swiped the .devcontainer thing from VS Code defaults. That's where I got the word too. I was like, "Yes, devcontainer. That makes sense."

Kerr: Microsoft has put a lot of thought into this.

 


 

Recorded at:

May 13, 2022
