In the first chapter of the book "Test-Driven Infrastructure with Chef, 2nd Edition", Stephen Nelson-Smith discusses the philosophy behind Test-Driven Infrastructure. He lists two fundamental philosophical points:
- Infrastructure can and should be treated as code
- Infrastructure developers should adhere to the same principles of professionalism as other software developers.
Then, he dives deeper into each of them.
In 2006, the emergence of utility computing like Amazon EC2 and 2nd generation web frameworks like Ruby on Rails enabled even the smallest teams to create Internet-scale applications. Soon Puppet and then, a few years later, Chef were introduced to help people manage their large-scale web infrastructures. Stephen says in his book:
Through co-design of the infrastructure code that runs an application, we give operational responsibilities to developers. By focusing on design and the software lifecycle, we liberate system administrators to think at higher levels of abstraction.
After digging into the history of Infrastructure as Code, Stephen lists some of its principles, such as:
- Break the infrastructure down into independent, reusable, network-accessible services.
- Integrate these services in such a way as to produce the functionality our infrastructure requires.
Then he highlights the main risks of an Infrastructure as Code approach: duplication, a lack of clear understanding of the infrastructure code, fear of changing the code, and dependence on a few key people who are the only ones able to understand it.
To mitigate those risks, Stephen insists that infrastructure code be written with the same care and professionalism as application code. Modular design, collective code ownership, code reviews, coding standards, refactoring, and testing are necessary practices when writing any kind of code, infrastructure code included.
After introducing the philosophy, Stephen dedicates full chapters to an introduction to Ruby, an introduction to Chef, and a description of the basic tools people commonly use in conjunction with Chef, like VirtualBox and Vagrant.
In chapter five, Stephen introduces Test- and Behaviour-Driven Development.
Now the reader should be equipped with the basic understanding of all the concepts and tools required to start with test-driven development of infrastructure code.
In chapter six, Stephen introduces his conceptual framework for Test-Driven Infrastructure called MASCOT. He says:
Test-driven infrastructure should be:
- Mainstream
- Automated
- Side effect aware
- Continuously integrated
- Outside-in
- Test-first
Describing each point in greater detail fills the remainder of chapter six.
Chapter seven recommends a tool chain for test-driven infrastructure. Stephen starts the chapter by re-emphasizing the two philosophical foundations of test-driven infrastructure:
- Infrastructure can and should be treated as code.
- Infrastructure developers should adhere to the same principles of professionalism as other software developers.
Then he describes the different types of testing (unit testing, integration testing, and acceptance testing) and the testing workflow, to show when each of those types comes into play.
After describing the testing types and the testing workflow, Stephen introduces his recommended set of tools:
- Berkshelf for managing cookbook dependencies
- Vagrant for managing virtual machines for running tests
- Test Kitchen for orchestrating tests across multiple nodes and platforms
- Cucumber and Leibniz for acceptance testing
- Serverspec and Bats in conjunction with Test Kitchen
- Minitest Handler for integration testing
- Chefspec for unit testing
- Static analysis and linting tools like Foodcritic, Knife Cookbook Test, Tailor, and Strainer
Each of the above sections comes with detailed getting-started instructions, code examples, and a description of the pros and cons of each tool. This section is the most practical part of the book, where the reader can follow along and get going with the described tools. Stephen finishes the book with a summary of how to use each of the tools in the testing workflow.
Stephen was also kind enough to answer several questions related to his book.
InfoQ: What was your motivation to write that second edition of TDI?
Stephen: The first book was really a toe in the water - a manifesto. But I felt rather like John the Baptist - a voice crying in the wilderness! I had a fierce sense that we needed to apply test-first principles and BDD to infrastructure code, and I wanted to share that vision. A short while after the publication of the first edition, it was clear I'd hit a nerve - the ecosystem had blossomed and everyone was talking about test-driven infrastructure. It was clearly necessary to update and expand the first edition, to meet demands, and besides, I think I had more to say! I wouldn't be surprised to see the same happen again, as the world matures and expands further.
InfoQ: You state two philosophical foundations for TDI:
- Infrastructure can and should be treated as code.
- Infrastructure developers should adhere to the same principles of professionalism as other software developers.
What are the biggest objections you hear about regarding those two?
Stephen: To be honest I think I rarely hear objections to the first any more - it's become almost a mainstream view. However on occasions when opposition is met I think it's usually down to two things. Firstly, there's a concern, especially at very small scale, that writing infrastructure code is wasteful - just build the machine and be done with it, and secondly, especially in more traditional environments, there's sometimes a reluctance or insecurity from IT professionals who are uncomfortable at the idea that they might need to learn to program.
Regarding the second, I think this is universally accepted. But I think my particular interpretation that professionalism means test-first, and continuously integrated is the state of the art isn't always accepted by engineers who are rightly proud of their automation efforts. On those occasions I think it's important not to come across as elitist or narrow-minded - I'm certainly not saying that engineers building out automation systems without an XP-derived test-first approach are unprofessional. What I am saying is that we need to recognise that the software we're developing is underpinning what is often the core of the business. We wouldn't deploy the applications into production without an automated testing mechanism - we need to treat our infrastructure in the same way.
InfoQ: You state that: "Testing our infrastructure code, thoroughly and repeatably, is non-negotiable, and is an essential component of the infrastructure developer’s work." Your book tries to provide encouragement and the necessary know-how to follow this statement. What do you tell an aspiring infrastructure developer who asks you how to do this with all the fire-fighting going on in their job?
Stephen: Fire-fighting has always been a part of the sysadmin's life. We knew that when we got into the business. But the smart sysadmin understands that they always need to be thinking beyond the fix. What can we do to make the system more dependable, how can I reduce the likelihood of this issue happening again? I think it's now widely accepted that building and maintaining our infrastructure with an automation framework such as Chef helps this endeavour, and that applying the principles I advocate will ultimately result in a reduction in firefighting. I have much sympathy for the overworked sysadmin, but I think we need to be mindful of the adage of the sweat-drenched wood-cutter, labouring for hours against a particularly tough tree, insisting that they have no time to sharpen their saw, because they're too busy cutting wood...
InfoQ: In Chapter 6 you define a conceptual framework for TDI based on your MASCOT principles:
Test-driven infrastructure should be:
- Mainstream
- Automated
- Side effect aware
- Continuously integrated
- Outside-in
- Test-first
How far do you think the adoption of TDI has come so far and what do you see for the future?
Stephen: From my observations within my community, which is, admittedly, skewed in this direction anyway, I think the principles and tooling are widely embraced. I think I see this mindset expanding throughout the industry. I don't think it's quite mainstream yet, but it's certainly on the way.
In terms of automation, I think this is close to nailed. Advancements in frameworks like Test-Kitchen have made a big difference, and as some of the work around continuous delivery/deployment of infrastructure by people like Chris McLimmans, myself and others at Chef is now beginning to trickle into the open source space, I think we're in good shape here.
Regarding side-effects, I think people are very keenly aware now of the ripple effect of even small changes, and as it becomes more common to be running tests via Jenkins or Travis, I see this awareness very much on the increase. This I think leads into the CI conversation - the tooling and the attitudes are almost at the level that we can start to expect all infrastructure developers to have all their cookbooks continuously integrated. The gap for me, at present, is that the support for Windows is still a bit behind that of Linux - but there's momentum to fix this, and I'm excited to see it.
The last two are the least mature. The advances in ChefSpec 3.0 and the release and support of Test Kitchen 1.0 have made unit and integration testing relatively accessible. However, the type of outside-in acceptance testing - specification by example - is still not yet given as much attention as I'd like to see. As for test-first - I'm a firm believer in this as a workflow, but it's an acquired taste, and takes a certain degree of discipline to pursue. More often than not, I see people retro-fitting tests after they write their recipes. I'd like to see that the other way around. I think perhaps this needs a bit more evangelism and demonstration.
InfoQ: You propose an outside-in approach to testing your infrastructure using Cucumber for acceptance testing and RSpec for unit testing. You use Test Kitchen (and the Leibniz Ruby gem) to automate testing. What do you think about the tool landscape in the TDI space? How will it evolve over the next 2-3 years?
Stephen: I think we're beginning to settle on a common toolset for unit and integration testing - based around RSpec (ChefSpec, ServerSpec) and enabled by Test Kitchen. What I'd hope and expect to see more of in the next few years is work on how to do multi-node, stack-level acceptance testing. Leibniz is a very simple gem - all it really does is expose helpers to launch infrastructure with a given run list, enabling developers to write meaningful examples in Gherkin. The hard work is in writing the tests themselves. I think there's also some work to be done around orchestration - setting up a number of machines, waiting for certain machines to be ready, and then running the tests. This kind of endeavour requires some thought and some engineering, and is the space where I'd expect to see most innovation.
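The orchestration problem Stephen describes (set up a number of machines, wait for certain machines to be ready, then run the tests) can be sketched in plain Ruby. This is only an illustrative sketch under stated assumptions, not how Leibniz or Test Kitchen actually work: `Node`, `wait_until_ready`, and `run_acceptance_tests` are hypothetical names, and the readiness checks are stand-in lambdas rather than real SSH or HTTP probes.

```ruby
# Hypothetical sketch: none of these names belong to Chef, Test Kitchen,
# or Leibniz. Readiness checks are stand-ins for real probes.

# A node is a name plus a callable that reports whether it is ready.
Node = Struct.new(:name, :ready_check) do
  def ready?
    ready_check.call
  end
end

# Poll every node until all report ready, or give up after max_attempts.
# Returns true if all nodes became ready, false otherwise.
def wait_until_ready(nodes, max_attempts: 10, interval: 0.01)
  max_attempts.times do
    return true if nodes.all?(&:ready?)
    sleep interval
  end
  false
end

# Run the given tests only once every node is ready.
# `tests` maps a description to a callable returning true or false.
def run_acceptance_tests(nodes, tests, **wait_options)
  unless wait_until_ready(nodes, **wait_options)
    raise "nodes not ready: #{nodes.reject(&:ready?).map(&:name).join(', ')}"
  end
  tests.transform_values(&:call)
end

# Stand-in readiness checks: the database reports ready on its third poll.
db_polls = 0
nodes = [
  Node.new("web", -> { true }),
  Node.new("db",  -> { (db_polls += 1) >= 3 })
]

tests = {
  "web node is in the stack"  => -> { nodes.map(&:name).include?("web") },
  "db was polled until ready" => -> { db_polls >= 3 }
}

results = run_acceptance_tests(nodes, tests)
results.each do |description, passed|
  puts "#{passed ? 'PASS' : 'FAIL'}: #{description}"
end
```

In a real setup, the readiness check would probe the machine itself (for example, whether a service port accepts connections), and the tests would be Cucumber scenarios or similar rather than inline lambdas. The sketch only shows the ordering concern: no test runs until every node it depends on reports ready.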
About the Book Author
Stephen Nelson-Smith (@LordCope) is principal consultant at Atalanta Systems, a fast-growing agile infrastructure consultancy, and Opscode training and solutions partner in Europe. One of the foundational members of the emerging Devops movement, he has been implementing configuration management and automation systems for five years for clients ranging from Sony, the UK government and Mercado Libre to startups amongst the burgeoning London 'Silicon Roundabout' community. A UNIX sysadmin, Ruby and Python programmer, and lean and agile practitioner, his professional passion is ensuring operations teams deliver value to the business. He is the author of a popular blog, and lives in Hampshire, UK, where he enjoys outdoor pursuits, his family, reading, and opera.