At QCon New York 2015, Mitchell Hashimoto discussed how tools such as HashiCorp's Terraform and Consul could be utilised to orchestrate the infrastructure provisioning and application lifecycle management of cloud and container-based applications, with the ultimate goal of safely delivering software systems at scale.
Hashimoto, co-founder of HashiCorp and creator of Vagrant and Packer, began the talk by stating that the ultimate goal of orchestration in the context of software systems is to safely deliver applications at scale. Container technology solves the problems of application packaging, image storage and execution. For example, with Docker these issues are solved with Docker images, the Docker registry and the Docker daemon respectively. However, other challenges remain, such as infrastructure lifecycle and provisioning, monitoring, service discovery and configuration, security, and application lifecycle management.
Hashimoto introduced Terraform, which enables developers and operators to "build, combine and launch infrastructure safely and efficiently". Terraform utilises configuration files written in HashiCorp Configuration Language (HCL), which is fully JSON compatible and inspired by libucl, to create infrastructure with code. Servers, containers, load balancers, databases etc. can be defined and applied with a single command. Previews of changes to infrastructure can also be generated, and the resulting "diffs" compared, discussed and submitted as pull requests, much like a typical developer workflow that utilises a DVCS like Git.
Service discovery and configuration can be handled with Consul. Hashimoto explained that questions such as "where is service foo?", "what is the health status of the machine foo?" and "what is the configuration of service foo?" can all be answered with Consul. Service discovery is provided by an HTTP API and "legacy-friendly" DNS, and both methods provide service health status (or do not return unhealthy nodes).
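As a rough illustration of the "do not return unhealthy nodes" behaviour, the Python sketch below filters a hand-written sample payload shaped like the response of Consul's HTTP health endpoint for a service (/v1/health/service/<name>). The sample data, the service name "foo" and the helper healthy_endpoints are assumptions made for illustration and were not shown in the talk; no network call is made.

```python
import json

# Hand-written sample shaped like a Consul health response for service "foo".
# All node names, addresses and check names are made up for illustration.
SAMPLE_RESPONSE = json.loads("""
[
  {"Node": {"Node": "web-1", "Address": "10.0.0.11"},
   "Service": {"Service": "foo", "Address": "", "Port": 8080},
   "Checks": [{"Name": "Serf Health Status", "Status": "passing"},
              {"Name": "foo http check", "Status": "passing"}]},
  {"Node": {"Node": "web-2", "Address": "10.0.0.12"},
   "Service": {"Service": "foo", "Address": "", "Port": 8080},
   "Checks": [{"Name": "Serf Health Status", "Status": "passing"},
              {"Name": "foo http check", "Status": "critical"}]}
]
""")


def healthy_endpoints(entries):
    """Return (address, port) for entries whose checks are all passing."""
    endpoints = []
    for entry in entries:
        if all(check["Status"] == "passing" for check in entry["Checks"]):
            # Fall back to the node address when the service has none of its own.
            address = entry["Service"]["Address"] or entry["Node"]["Address"]
            endpoints.append((address, entry["Service"]["Port"]))
    return endpoints


if __name__ == "__main__":
    print(healthy_endpoints(SAMPLE_RESPONSE))
```

In this sketch the second node is dropped because one of its checks is critical. A client-side filter like this is only one option; the same result can be requested from Consul directly by asking for passing instances only.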
Consul also provides a key/value store for the highly available storage of configuration that allows operators to "turn knobs without big configuration management process". Although local by default, Consul can be configured to be multi-datacenter aware. Hashimoto suggested that Consul can be run in or outside of a container, and the speed and scalability of Consul make this tool ready for the 'container scale' of future applications and the challenges this presents.
InfoQ caught up with Mitchell Hashimoto after his talk and asked questions about Terraform, Consul and container technology:
InfoQ: Hi Mitchell, and thank you for agreeing to answer questions for the InfoQ community. Firstly, could you briefly introduce yourself and Terraform?
Hashimoto: Sure, I'm perhaps most broadly known as the creator of Vagrant. Within the ops industry, however, I'm better known as the creator of Packer, Serf, Consul, Terraform, and Vault. All of these tools are in use by the largest websites in the world at massive scale, and I'm proud to have played a part in their creation. I describe myself as "automation obsessed:" I love to take tasks and teach computers to do them. I'm doing this now in the DevOps field but I've also done it in the past in various other areas such as education, communities (forums), and gaming.
Terraform is a tool we came out with last year. It can be described as a tool for describing and building infrastructure, both initially and over time. If you're just coming into Terraform, you can compare it to something like AWS CloudFormation. But as you dive deeper into Terraform, you'll come to discover that it is much more powerful. Terraform is able to create infrastructure on any cloud platform and combine those together. For example, you can create an AWS EC2 instance, then use the IP address of that instance to configure a DNS record in CloudFlare.
But that is all just the beginning... I recommend taking a look at terraform.io!
InfoQ: There are often questions about the role that Terraform will play within datacenter/infrastructure automation, particularly in relation to Ansible (and even Puppet and Chef etc), which provide overlapping functionality. Could you provide any guidance to this, and the ideal Terraform use cases?
Hashimoto: I have to apologize that I've never used Ansible full time, so I can't directly compare with confidence. However, I can provide guidance in relation to Chef, Puppet...
Configuration management tools such as Chef/Puppet can't compare to Terraform. There is a slight overlap in that they can manage the creation of initial resources (starting some servers, connecting them to a load balancer, etc.), but even that breaks down very quickly. Terraform has two main advantages at a core engine level vs. a configuration management tool. First, Terraform allows you to reference the attribute of any resource within any other resource. This allows you to start an AWS instance, then take its IP address and configure a DNS record. Chef and Puppet both have support for variables, but not arbitrary attribute access. This becomes important because things such as IP address aren't available ahead-of-time! In Terraform verbiage, we call it a "computed" attribute; you can only realize the value at runtime.
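To illustrate the point about computed attributes, here is a minimal Python sketch (not from the interview, and not how Terraform is implemented). It models a toy configuration in which a DNS record refers to an instance's IP address, orders the resources by their references, and substitutes the value only once the instance has been "created". The resource names, the reference syntax handled by the regular expression, the fake_create provider stub and the IP address are all assumptions. It needs Python 3.9 or later for graphlib.

```python
import re
from graphlib import TopologicalSorter

# Matches references of the form ${type.name.attribute} (toy syntax for this sketch).
REF = re.compile(r"\$\{([^.}]+\.[^.}]+)\.([^}]+)\}")

# Hypothetical configuration: the DNS record refers to the instance's IP,
# which is unknown until the instance has been "created".
CONFIG = {
    "aws_instance.web": {"ami": "ami-123456"},
    "dns_record.www": {"name": "www", "value": "${aws_instance.web.public_ip}"},
}


def fake_create(name, attrs):
    """Stand-in for a provider call; adds attributes known only at runtime."""
    created = dict(attrs)
    if name.startswith("aws_instance."):
        created["public_ip"] = "203.0.113.10"  # made-up address
    return created


def apply_config(config):
    # Build the dependency graph from references found in attribute values.
    deps = {
        name: {m.group(1) for value in attrs.values() for m in REF.finditer(value)}
        for name, attrs in config.items()
    }
    state = {}
    # static_order() yields each resource after everything it refers to.
    for name in TopologicalSorter(deps).static_order():
        resolved = {
            key: REF.sub(lambda m: state[m.group(1)][m.group(2)], value)
            for key, value in config[name].items()
        }
        state[name] = fake_create(name, resolved)
    return state


if __name__ == "__main__":
    print(apply_config(CONFIG)["dns_record.www"])
```

The design point is that the reference itself defines the ordering: because the record mentions the instance, the instance has to be created first, and its runtime-only attribute becomes available for substitution.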
The other advantage Terraform has is more complex lifecycle management of resources. For example, in both Chef and Puppet you can't easily say: create the new resource before you delete the old. That is a very basic lifecycle feature that Terraform has. The importance of this is obvious: if you have a server connected to a load balancer, and you make an update that requires a new server to be created (for example changing the base image), then you want to create the new server, atomically update the load balancer, and _then_ destroy the old server. This sort of logic is hard to represent in Chef/Puppet.
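The ordering described here can be sketched in a few lines of Python. This is an illustrative model only, under the assumption of a single server behind a single load balancer; the step names and both helper functions are hypothetical and do not correspond to any Terraform API.

```python
def replacement_steps(old_server, new_server, create_before_destroy=True):
    """Return the ordered steps for swapping a server behind a load balancer."""
    create = ("create", new_server)
    switch = ("point_load_balancer_at", new_server)
    destroy = ("destroy", old_server)
    if create_before_destroy:
        # The old server keeps serving traffic until the new one is attached.
        return [create, switch, destroy]
    # Destroy-first ordering leaves a window with no live server attached.
    return [destroy, create, switch]


def has_downtime(steps, old_server):
    """Walk the steps and report whether the balancer ever targets a dead server."""
    alive, target = {old_server}, old_server
    for action, server in steps:
        if action == "create":
            alive.add(server)
        elif action == "destroy":
            alive.discard(server)
        elif action == "point_load_balancer_at":
            target = server
        if target not in alive:
            return True
    return False


if __name__ == "__main__":
    for flag in (True, False):
        steps = replacement_steps("web-old", "web-new", create_before_destroy=flag)
        print(flag, has_downtime(steps, "web-old"))
```

Walking the two orderings shows why the create-first variant matters: only the destroy-first sequence ever leaves the load balancer pointing at a server that no longer exists.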
And, for fun but in all seriousness, if you delete a resource from a Terraform config, Terraform will delete that resource. Puppet/Chef require you to explicitly mark a resource as deleted and won't detect you deleting it from the config. Amusingly, when I make this note in many customer meetings, I've gotten cheers!
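The deletion behaviour amounts to comparing the desired configuration with what is already recorded as existing. A minimal Python sketch of that comparison follows; it assumes resources are identified by name only, and the plan function and resource names are hypothetical rather than Terraform's actual planning logic.

```python
def plan(config, state):
    """Compare desired resource names (config) with recorded ones (state)."""
    desired, existing = set(config), set(state)
    return {
        # In the config but not yet recorded: needs creating.
        "create": sorted(desired - existing),
        # Recorded but no longer in the config: removal implies destruction.
        "destroy": sorted(existing - desired),
    }


if __name__ == "__main__":
    state = {"aws_instance.web": {}, "aws_instance.worker": {}}
    config = {"aws_instance.web": {}}  # the worker was deleted from the config
    print(plan(config, state))
```

Nothing has to be marked as deleted: the absence of the worker from the configuration is itself what produces the destroy action in this model.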
InfoQ: Vendor/provider-specific infrastructure tools such as AWS CloudFormation are tuned towards their target platform, and offer fine-grained control of configuration. Are you concerned that a generic tool such as Terraform may not be able to compete with these tools if an organisation utilises (or relies heavily upon) only a single vendor?
Hashimoto: We're not concerned about this at all.
We hired a single person full time to work on AWS support for Terraform. Within 1 month, we had coverage over almost all the requested resources from users. And, just last month (his 2nd month working with us), we've added multiple features to Terraform before CloudFormation even supports them. That is usually enough to quell the concern. But to add a bit more: it is so simple to write a Terraform resource, that we've had a huge increase in community contributions that help make this possible. I think over time, we'll be ahead of everyone else in "time-to-support-new-feature."
This isn't just an AWS phenomenon: we have support for yet unreleased features of Google Cloud as well. In general, we work closely with these vendors to make sure we can support their platforms right away. It is a win-win: yes, they perhaps lose users from their vendor-specific tooling, but they gain paying customers onto their platform, and we gain users.
InfoQ: When Terraform doesn't support a specific part of a provider's API (e.g. configuring cipher support on SSL load balancers), users have to make these changes outside of Terraform. What are the implications when Terraform is run after these changes? What also will happen when future versions of Terraform add support for these features?
Hashimoto: If Terraform isn't managing a property (such as ciphers on SSL load balancers), it also won't view any changes there as drift, and it won't affect Terraform runs.
When Terraform supports that property, as long as it matches the config, Terraform won't view it as drift and will make no changes.
Simple, I believe.
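The rule described in this answer, that only managed properties are compared, can be sketched in Python as follows. This is an illustrative model, not Terraform's diffing code; the attribute names (instance_port, ssl_ciphers), the cipher value and the detect_drift helper are assumptions.

```python
def detect_drift(config, actual):
    """Report differences only for attributes that appear in the config."""
    return {
        key: {"want": want, "have": actual.get(key)}
        for key, want in config.items()
        if actual.get(key) != want
    }


if __name__ == "__main__":
    # Ciphers were set by hand, outside of the tool.
    actual = {"instance_port": 443, "ssl_ciphers": ["EXAMPLE-CIPHER"]}

    # 1. The property is not managed, so the manual change is invisible.
    print(detect_drift({"instance_port": 443}, actual))

    # 2. The property becomes managed and the config matches: still no drift.
    print(detect_drift({"instance_port": 443, "ssl_ciphers": ["EXAMPLE-CIPHER"]}, actual))

    # 3. The property is managed and differs: drift is reported.
    print(detect_drift({"instance_port": 443, "ssl_ciphers": []}, actual))
```

The practical consequence is that, when support for such a property arrives, the configuration should be brought in line with the manually applied value before the next run if no change is intended.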
InfoQ: Infrastructure vendor/provider APIs and SDKs do evolve and change over time. What are the plans for Terraform to maintain backward compatibility, and also to add support for new vendor features?
Hashimoto: Infrastructure providers are quite good at API compatibility. For an example, just look at AWS, which still supports API protocols back to 2006! Because of this, the concern isn't at the API level but more generally at the library level. However, since Terraform is written in Go, already released binaries aren't affected by this since the library is statically compiled. But we've had cases where the master branch of Terraform suddenly stops compiling due to upstream library changes. This is annoying, but we're planning on combating this with the future vendoring support coming into Go.
As for new features, every major infrastructure provider has now moved to fully auto-generated SDKs. The SDKs generally support the features as soon as they're available, allowing us to easily plug them in!
InfoQ: As Terraform is not yet released as Generally Available, how much backward compatibility will be maintained with the current commands and resulting configuration files (e.g. the ".tfstate" file) before we reach a v1.0 release?
Hashimoto: Very, very large companies are betting on Terraform. We'll be sure to offer easy migration paths forward towards Terraform 1.0. Up to this point, we've auto-migrated. The state contains a "version" field and we use that to version the structure.
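Auto-migration keyed on a "version" field is a common pattern, and a minimal Python sketch of it is shown below. Only the existence of the "version" field comes from the answer above; the state layouts, the "resources" and "managed" keys, and the migration function are invented for illustration and do not reflect the real ".tfstate" format.

```python
CURRENT_VERSION = 2


def migrate_v1_to_v2(state):
    # Invented change for this sketch: v2 renames "resources" to "managed".
    return {"version": 2, "managed": state.get("resources", {})}


# Maps a version number to the function that upgrades it by one step.
MIGRATIONS = {1: migrate_v1_to_v2}


def upgrade(state):
    """Apply migrations one version at a time until the state is current."""
    if state["version"] > CURRENT_VERSION:
        raise ValueError("state was written by a newer version of the tool")
    while state["version"] < CURRENT_VERSION:
        state = MIGRATIONS[state["version"]](state)
    return state


if __name__ == "__main__":
    old_state = {"version": 1, "resources": {"aws_instance.web": {"id": "i-abc"}}}
    print(upgrade(old_state))
```

Chaining single-step migrations means a state written by any earlier version can be brought forward without a dedicated upgrade path for every pair of versions.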
InfoQ: Currently Terraform requires a ".tfstate" file to exist, which contains the infrastructure state resulting from the previous run, before an update can be performed. Do you have any timeline on when Terraform will be able to build a state file simply by querying a provider?
Hashimoto: "Soon." Before 1.0, probably long before 1.0. If I were to make a non-promising guess, I'd say 0.8.
The data in an infrastructure provider is strictly less than the full configuration of Terraform, so this process will be lossy. But it is still valuable to be able to build as much of the config as possible. So we're going to do that.
InfoQ: In the latest release of Terraform 0.5, multi-provider support has been added, which will allow users to define their infrastructure across multiple providers in a single file (for example, multi-region in AWS). Do you intend to provide additional tooling for managing what could be a large amount of global infrastructure in a single text file?
Hashimoto: Terraform will load any *.tf files in a single directory, so you're able to split that up. Plus, you can use modules to further split that up as well as parameterize it.
The goal of Terraform is to give you enough options so that the complexity here isn't overwhelming.
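As a small illustration of the "any *.tf files in a single directory" rule, the Python sketch below lists the files that would be picked up from one directory, ignoring other extensions and subdirectories. The config_files helper is hypothetical, and the sorted order is chosen here only to make the output deterministic; it is not a statement about the order in which Terraform reads files.

```python
from pathlib import Path


def config_files(directory):
    """Return the *.tf files directly inside `directory`, sorted by name."""
    # glob("*.tf") does not descend into subdirectories.
    return sorted(p for p in Path(directory).glob("*.tf") if p.is_file())


if __name__ == "__main__":
    for path in config_files("."):
        print(path.name)
```

Splitting a large configuration into files such as network.tf and servers.tf (names chosen for illustration) would therefore still be treated as one configuration in this model.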
InfoQ: You have recently announced a new conference, "HashiConf", focused on topics within automating the modern datacenter. Can you tell us more about this please? Are you trying to extend the community around HashiCorp products, or perhaps provide a forum for like-minded developers and operators to share ideas?
Hashimoto: We have a very large community! We've been asked a lot in the past year to create a HashiCorp community event to bring like-minded people together who have an interest in our tools so we can all share knowledge. This is HashiConf: a conference focused on HashiCorp tooling and its interaction in the real world and with other projects. For example, the talks won't be "how to use HashiCorp product X", they're more generally "How company X used product Y with non-HashiCorp project Z".
In addition to this, we'll have hundreds of HashiCorp's most passionate users in one place, so we'll be sure to reward them. :)
InfoQ: Thanks once again for taking the time to answer these questions Mitchell. Is there anything else you would like to share with the InfoQ audience?
Hashimoto: I think that's all! Thanks for having me and I enjoy answering these. If anyone wants to contact me further, just find my Twitter, which has my email.
More information about Mitchell Hashimoto's QCon talk "Orchestrating Containers with Terraform and Consul" can be found on the conference website.
InfoQ would like to express appreciation to Bart Spaans, a Terraform expert and contributor to the Google discussion group, for his help in creating several of the questions above.