Mike McGarr, manager of developer productivity at Netflix, recently presented Better DevEx at Netflix: Polyglot and Containers at QCon New York 2018. He described how Netflix evolved from operating as a Java shop to supporting developer tools built with multiple programming languages. This has ultimately provided a better development experience for software engineers.
Netflix started operating out of a datacenter using a Java EE monolith application tied to an Oracle database. While this model worked well, the challenges of streaming and scaling required Netflix to adopt a new model built on Amazon Web Services, using a Java microservices architecture tied to a Cassandra database. McGarr, having stated that this new model has made Netflix engineering famous, succinctly described the effort:
Through this transition, the engineering team made some amazing decisions and overcame some challenges in making this change.
Java at Netflix
As shown below, centralized teams at Netflix built developer tools with Java to support their customers, that is, software engineers at Netflix.
The Paved Road
The "paved road" at Netflix is their standardized path to build code, "bake" to immutable servers, and deploy to Amazon Web Services. It incorporates build tools such as Gradle, Jenkins, Spinnaker, and home-grown Nebula.
Netflix customers willing to follow this "paved road" will receive first-class support from Netflix.
Nebula
Nebula, created and open-sourced by Netflix, is described on the What is Nebula? page:
Nebula is a collection of Gradle plugins built for engineers at Netflix. The goal of Nebula is to simplify common build, release, testing and packaging tasks needed for projects at Netflix. In building Nebula, we realized that many of these tasks were common needs in the industry, and it was worth open sourcing them.
OSPackage, a Nebula plugin, converts a Java application to a Debian package in preparation for the "baking" process. It's an important step within the "paved road" that has worked very well for Netflix.
Non-Java at Netflix
As demonstrated in the graph below, Netflix observed a shift in popularity toward non-Java programming languages such as Python and JavaScript.
Internal customers increasingly asked for support for non-Java build tools to deploy Netflix cloud applications. Netflix responded by offering a special Gradle build that packages a NodeJS application into a Debian package with OSPackage. This way, the "paved road" was still maintained. NodeJS developers weren't keen on using Gradle, but Netflix continued to provide NodeJS support this way until it reached a tipping point. As McGarr stated:
That tipping point came in the form of a particular team at Netflix that made an architectural decision to move from a Java-based infrastructure, that was core to Netflix, to moving to containers and NodeJS. As soon as we heard of this project, knowing there was a subtle increase in JavaScript and Python, we realized that our answer could no longer work for the NodeJS community. So we had to make a change.
Their initial idea was to package a NodeJS application into a Debian package using native JavaScript tools. Netflix wanted to build "Nebula for NodeJS" by finding the best JavaScript build tool and creating plugins for it. However, Netflix discovered a vast "universe" of JavaScript build tools and patterns that aren't present in the Java community. For example, multiple JavaScript build tools, such as Grunt, Gulp, and Browserify, could be used in the same project because of their different use cases. This is much different from using a single Java build tool such as Gradle, Maven, or Ant. Because of this, Netflix couldn't create a simple build tool to serve the NodeJS community and had to approach the problem differently.
Developer Workflow
Netflix re-evaluated Nebula and discovered that, along with building code and dependency management, Nebula also, more importantly, provides developer workflow. Therefore, Netflix decided to create a brand new tool to address developer workflow with the following requirements:
- Language agnostic
- Native or native-like
- Reduce the cognitive load
Netflix Workflow Toolkit (Newt)
Newt, built with Google's Go programming language, is a new command-line developer workflow tool that satisfies the three requirements listed above. Netflix chose Go because it can compile a static binary (that is, one with no dependencies) and it can cross-compile to multiple operating systems and architectures.
Newt was initially built for NodeJS developers and features a number of built-in command-line parameters, a few of which we examine here.
newt package is a utility that simplifies the NodeJS Debian packaging process. It points at a Docker image and executes Nebula. Despite the original goal to eliminate Nebula in NodeJS application development, Netflix felt it was important to still utilize it. As McGarr explained:
The problem with using Nebula for non-Java developers wasn't so much the tool itself, but the experience of having to install and maintain and learn something outside the ecosystem. But OSPackage worked very well and it's used all the time. It's very well vetted.
So the idea behind this was "let's throw it into a container," put Java, Gradle, and some build files in there, and distribute that container to all the developers. Whenever they want to run newt package, Newt will pull down that container, mount the local file system, run OSPackage inside the container and spit out a Debian. And they never have to touch Java.
McGarr emphasized that this is one of the best uses for containers because they can be used as a tool distribution and reuse mechanism.
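The pull, mount, and run sequence McGarr describes can be sketched in Go, the language Newt is written in. This is only an illustration and not Newt's actual code: the image name, the /workspace mount point, and the assumption that the image has gradle on its path are all hypothetical. The buildDeb task is the Debian packaging task provided by the Nebula OSPackage plugin.

```go
package main

import (
	"fmt"
	"strings"
)

// packagerImage is a hypothetical image name; the image Netflix
// actually distributes is not public.
const packagerImage = "example.internal/nebula-ospackage:latest"

// dockerRunArgs builds the argument list for a `docker run` call that
// bind-mounts the project directory into the container and runs a
// Gradle task against it.
func dockerRunArgs(image, projectDir, task string) []string {
	return []string{
		"run", "--rm", // remove the container once the task finishes
		"-v", projectDir + ":/workspace", // mount the local project (assumed mount point)
		"-w", "/workspace", // run from the mounted directory
		image,
		"gradle", task, // assumes gradle is available inside the image
	}
}

func main() {
	args := dockerRunArgs(packagerImage, "/home/dev/my-node-app", "buildDeb")
	// Print the command instead of executing it so the sketch runs
	// on a machine without Docker installed.
	fmt.Println("docker " + strings.Join(args, " "))
}
```

Because Java and Gradle live only inside the image, the developer's workstation needs nothing beyond Docker itself, which is the point McGarr makes about containers as a distribution mechanism.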
NodeJS developers don't have to worry about installing Docker because the concept of Newt was to automatically configure a local workstation. newt setup is a self-configuration utility that checks a predefined list of dependencies and automatically installs and updates whatever is missing or outdated. It was designed to automatically run in the background when other Newt commands are called, but may be explicitly called as necessary.
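A dependency check of this kind might look roughly like the following Go sketch. It is a guess at the shape of the idea rather than Newt's implementation: the dependency list is hypothetical, and the sketch only reports what is missing instead of installing or updating anything.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns the entries of required that isInstalled rejects.
// The check is passed in as a function so it can be exercised without
// depending on what is installed on the current machine.
func missingTools(required []string, isInstalled func(string) bool) []string {
	var missing []string
	for _, name := range required {
		if !isInstalled(name) {
			missing = append(missing, name)
		}
	}
	return missing
}

// onPath reports whether an executable with the given name is on PATH.
func onPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	// Hypothetical dependency list; Newt's real list is not public.
	required := []string{"docker", "git", "node"}
	for _, name := range missingTools(required, onPath) {
		fmt.Printf("%s is missing; a setup step would install it here\n", name)
	}
}
```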
To build a NodeJS application, newt build reads a configuration file, .newt.yml, that sits in the directory of each project and executes the build step defined in that file. For example:
build-step: npm run-script build
This calls the command npm run-script build, which finds the NodeJS project, reads the package.json file, and executes the build command defined in its scripts section:
"scripts": {
  "build": "npm install && eslint src/** && npm test"
}
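The indirection here, in which the tool looks up a command string and runs it without knowing what it does, can be sketched in Go as follows. The function name and the line-based parsing are assumptions made for illustration; a real tool would more likely use a proper YAML parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// buildStep scans simple "key: value" lines and returns the value of the
// build-step key. This is a line-based simplification, not a YAML parser.
func buildStep(config string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(config))
	for scanner.Scan() {
		key, value, found := strings.Cut(scanner.Text(), ":")
		if found && strings.TrimSpace(key) == "build-step" {
			return strings.TrimSpace(value), true
		}
	}
	return "", false
}

func main() {
	step, ok := buildStep("build-step: npm run-script build\n")
	if !ok {
		fmt.Println("no build-step configured")
		return
	}
	// A real tool would hand the string to a shell, for example with
	// exec.Command("sh", "-c", step); the sketch only prints it.
	fmt.Println("would run:", step)
}
```

Nothing in the sketch is specific to npm: swapping the configured command for another build tool requires no change to the code.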
Newt was designed to be language agnostic. As McGarr stated:
Newt has a very loose understanding of the world of build tools in that it's aware they exist and it can run them.
Newt knows how to build a project without knowing how to build a project. If the developer decides to switch to webpack or something else, Newt basically just runs that script.
To address the concern of versioning conflicts, versions of NodeJS and NPM can be explicitly defined in the .newt.yml file:
build-step: npm run-script build
tool-versions:
  - node: 6.9.1
  - npm: 3.10.8
Borrowing from a Ruby pattern, newt exec stores this version information in a special cache for use in the application.
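As a rough illustration of how a tool could read the pinned versions before deciding which binaries to use, the Go sketch below collects the entries under tool-versions into a map. The parsing is a deliberate line-based simplification, and the sketch does not attempt to model the cache McGarr mentions, whose details are not described.

```go
package main

import (
	"fmt"
	"strings"
)

// toolVersions collects the "- name: version" entries listed under the
// tool-versions key. Line-based simplification, not a YAML parser.
func toolVersions(config string) map[string]string {
	versions := map[string]string{}
	inSection := false
	for _, line := range strings.Split(config, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case trimmed == "tool-versions:":
			inSection = true
		case inSection && strings.HasPrefix(trimmed, "- "):
			name, version, found := strings.Cut(strings.TrimPrefix(trimmed, "- "), ":")
			if found {
				versions[strings.TrimSpace(name)] = strings.TrimSpace(version)
			}
		case trimmed != "":
			inSection = false // any other non-empty line ends the list
		}
	}
	return versions
}

func main() {
	config := "tool-versions:\n  - node: 6.9.1\n  - npm: 3.10.8\n"
	versions := toolVersions(config)
	fmt.Println("node:", versions["node"], "npm:", versions["npm"])
}
```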
Newt also provides an error-reporting command, newt report-error, to conveniently send error reports to the appropriate support staff.
Newt App Types
A Newt app type is a customized initialization that executes a predefined list of build tools to consistently create a new application. For example, node-ami-deploy creates a local git project from a template, publishes it to Netflix's git repository, creates any defined Jenkins jobs, and creates a Spinnaker delivery pipeline. It is initialized using the newt init command with the --app-type flag as follows:
newt init --app-type node-ami-deploy
An app type may also be defined in a .newt.yml file:
app-type: node-ami-deploy
A number of predefined app types have been created within Netflix. The newt init command provides a complete list.
Netflix discovered that Newt is a platform, not a tool, as Newt supports a cross-section of build tools for Java, JavaScript, and Python with the potential to support other languages in the future.
Unlike Nebula, Newt is only available within Netflix. However, McGarr stated that requests from the outside developer community could make it possible to one day open source Newt.
Lessons Learned
In the end, Netflix has learned:
- Polyglot can be expensive
- Containers make for great tool distribution
- Build platforms, not just tools
- Provide native (or native-like) solutions
- Reduce cognitive load
McGarr spoke to InfoQ about centralized teams at Netflix.
InfoQ: What, in particular, is the most important aspect about Netflix application development that you would like to share with our readers?
Mike McGarr: Our most important aspect with Netflix application development is our culture such that our customers are able to work with high trust and plenty of freedom and responsibility. Our managers provide context and do not control decisions. And we've found huge gains across the board with our customers.
InfoQ: What are the prospects of Newt being open-sourced?
McGarr: Only time will tell. We don't want to open-source Newt and "walk away." We need to evaluate the cost of internal versus external support and that would require additional resources to make that happen.
InfoQ: What's on the horizon for Netflix application development?
McGarr: We are continually looking for ways to improve the productivity and reduce cognitive load of our software engineers as they move through working with containers, cloud applications, agile processes, etc. This is a trend that I have noticed in other companies as well.
InfoQ: What are your current responsibilities, that is, what do you do on a day-to-day basis?
McGarr: As an engineering manager, I need to ensure that we have the right people and the resources they need to be successful. I also recruit, both by reaching out to candidates, as well as building "marketing" material to talk about the work we are doing.
I spend a lot of time meeting with other people at Netflix to collect information about what's going on, so I can provide context to my team. I focus on the "what" and the team provides the "how." This way, Netflix engineers have enough time to focus on flow.
Resources
- How We Build Code at Netflix by Netflix Technology Blog (March 9, 2016)
- The Evolution of Container Usage at Netflix by Netflix Technology Blog (April 17, 2017)