Configuration management is the foundation that makes modern infrastructure possible. Tools that enable configuration management are required in the toolbox of any operations team, and many development teams as well. Although all the tools aim to solve the same basic set of problems, they adhere to different visions and exhibit different characteristics. The issue is how to choose the tool that best fits each organization's scenarios.
This InfoQ article is part of a series that aims to introduce some of the configuration tools on the market, the principles behind each one and what makes them stand out from each other. You can subscribe to notifications about new articles in the series here.
Ansible started as an open source project in February of 2012 with the goal of automating multi-tier IT configurations and processes without relying on management agents to be installed on the remote systems. As it’s grown in use over the last two plus years, it is worth exploring some of the design choices and concepts presented - and why it works the way it does.
Foremost, it’s important to understand that Ansible is a general-purpose IT automation system, but does not wish to be considered exclusively a configuration management system. This is because, for many of our users, the more interesting part of the equation is in how business applications are deployed on top of the OS configuration - or how upgrade processes and ad-hoc server maintenance processes are orchestrated.
While Ansible does contain a robust series of modules for Configuration Management tasks (as with CFEngine, Puppet, or Chef), Ansible is also concerned with cloud provisioning (AWS, Rackspace, Google, OpenStack, etc), software deployment (in the same vein as Fabric or Capistrano), and automation of orchestration procedures like zero downtime rolling upgrades.
Ansible is able to apply itself to all of these different tasks more easily because rather than modeling rigorous configurations of single hosts, it is designed more around modeling services (which may span hosts) or arbitrary processes that a user would like to automate.
As such, very step wise processes that may require multiple steps to be executed on different hosts can fit into Ansible’s domain.
Ansible approaches management by not requiring any daemons to be installed on the remote systems, instead connecting to remote machines over OpenSSH. Ansible does not just execute shell commands over SSH, but uses it as a transport medium, transferring modules from a control machine to the remote managed machines.
These modules describe how to get a given resource from the current state of an environment to another, but can also break down into arbitrary commands when needed.
Ansible modules can be written in any dynamic language, and typically consume and emit JSON. as such, while Ansible is implemented in Python, it’s possible to extend Ansible with modules written in Ruby, Perl, or even Bash. The core of Ansible contains 235 some modules, which are all, by community convention, implemented in Python so they can be more easily maintained by the developer community as a whole.
Design Goals: Simplicity, Ease of Use, and Maximum Security
Ultimately, there are a lot of automation options out there, and it’s important for any user to evaluate them and choose the tool that best fits their mentality and approach to working.
Ansible focuses around ease of use, ease of development/change of automation content, and IT security.
Ease of use to Ansible is mostly around keeping syntax human readable, with the goal that a user not trained in Ansible should be able to read and understand an Ansible playbook, even to the point of understanding how it would configure a system. Ease of use is also magnified by not having to manage any remote daemons - with some other approaches, if a daemon is not installed or not functioning, a remote system cannot be managed. Similarly, such a daemon may consume too many resources on the remote node that are needed for other computing tasks.
Ease of development and change in automation content is somewhat misunderstood. Frequently the discussion is one of “low learning curve”, though ultimately the most important cost is day to day time. Ansible was written to minimize day to day time spent with the automation software and increase time spendable for development on core business applications or other strategic IT projects.
Finally, security is a key goal. Ansible uses OpenSSH as a transport because it is very widely peer reviewed, and does not create an extra system where keys must be managed. Benefitting from extreme levels of peer review and widespread use, in the event of OpenSSH having security vulnerabilities, it is patched rapidly.
Ansible uses OS native security credentials, so it works with su, sudo, Kerberos, passwords, keys, identity management software, and so on.
Example of a Zero Downtime Rolling Update with a LAMP Stack
To understand how Ansible can describe an arbitrary IT workflow processs, it’s helpful to see an example. Ansible calls it’s automation language playbooks, and one of the most fundamental shows how to deploy a LAMP stack that is using HAProxy as a load balancer. HAProxy is chosen here because is freely available, but Ansible also contains modules for working with load balancers such as Citrix Netscalers or F5 BigIP devices, as well as various IaaS-based load balancers.
The basic configuration of the system can be described as a simple list of hosts, separately managed in an inventory file. An example inventory file might look like this:
[webservers] www01.example.com www02.example.com [databases] db01.example.com db02.example.com
Ansible doesn’t have to use a static inventory file -- it can also dynamically pull inventory from various providers, such as EC2, with each tag name in a cloud automatically becoming available as a group. For this example however, we’ll just show something basic:
--- - hosts: all roles: - common - hosts: dbservers roles: - db - hosts: webservers roles: - base-apache - web - hosts: lbservers roles: - haproxy - hosts: monitoring roles: - base-apache - nagios
In the above example, at each “- hosts” line, Ansible is saying “talk to all the hosts in this group, and apply the selected roles to them“.
As such, it’s evident that Ansible descriptions of infrastructure can be very simple. Roles are abstractions around more detailed instructions in Ansible.
For instance, the “web” role is pretty simple, and contains a task file that looks like:
--- - name: Install php and git yum: name={{ item }} state=present with_items: - php - php-mysql - git - name: Configure SELinux seboolean: name=httpd_can_network_connect_db state=true persistent=yes when: sestatus.rc != 0 - name: Copy the code from repository git: repo={{ repository }} version={{ webapp_version }} dest=/var/www/html/
As shown above, various packages are installed with the yum package manager (if not already installed), SELinux security booleans are set (if so required), and software is checked out from a git repository.
The full source of this example is located on GitHub and you can see how it can be extended to work with HAProxy .
Why Push Based Automation Works: Declarative Resource Models and Optimizing The Implementation
Ansible is by default a push-based system that uses SSH for transport.
Many earlier configuration management systems were pull-based, where nodes would wake up periodically and check in with a central server.
In this model, it is often difficult to orchestrate a change that requires updating one set of servers before another, and is further complicated when doing things on other servers on behalf of another. Ansible makes this easier by being very explicit, pushing changes out to specific nodes, and making it possible to describe the steps one would update on one’s infrastructure, just as if telling a human to perform those steps.
The pull based methods are valid, and in some cases have some other interesting properties, but frequently it is assumed that SSH push-based either (A) is inefficient or (B) that Ansible is merely being “SSH in a for loop”, which is not correct.
While being push-based, Ansible keeps the the idea of declarative models of system state - asking systems to fulfill a given property rather than blindly running commands, but also recognizing that sometimes blindly running commands is required. Ansible also makes it possible to save the result of commands into variables and use the results of those variables to make decisions during the automation run.
There are a lot of tuning options in the SSH implementation, including the option to use ControlPersist to reuse SSH connections (which is a native SSH feature that will keep connections open for up to N minutes per host), the ability to “pipeline” operations to reduce file transfer, and full support for non-root operation, including the usage of sudo or su as appropriate. If on older Enterprise Linux platforms, an “accelerated mode” can login to remote systems over SSH and set up a higher speed secure transport that will expire after 30 minutes of non-use. Finally, Ansible makes use of python’s multiprocessing library, able to talk to many systems in parallel to increase speed of simultaneous configuration of larger infrastructures. In rolling updates for continuous deployment, you’ll seldom need this, but for running ad-hoc commands or batch reconfigurations, it’s quite handy to be able to spin up a very large number of machines in parallel.
Also it’s worth pointing out that while it can configure remote systems by SSH, it can also talk to web services and other APIs or even ask humans for input. Examples include the “uri” module for REST requests, or the cloud modules, which communicate with services including Amazon AWS, Rackspace, GCE, or OpenStack.
Ansible also features a “check mode” where it can run a playbook against a set of remote systems and report whether it would have needed to make any changes on the remotes to fulfill the playbook, all without making any changes. Using this method, it is possible to detect if systems may have drifted away from a previously configured state.
All of these properties together enable a model that is not only suited for classical configuration management, but also orchestration of higher level configurations, like the rolling update example above, or also application deployment scenarios such as deploying from a tarball, git repository, or other system on top of a pre-configured OS or cloud instance.
Enabling Community Contribution
Ansible is designed around being a very wide-scale open source project as it believes very much in connecting brilliant developers and systems administrators so they can share knowledge and common tooling.
To do this, it aims to be pluggable not only in terms of modules, but also various other plugins - including inventory sources to pull from different IaaS providers and management software, or even callback plugins to feed output into external systems like team chat servers.
Most notably, Ansible features a “batteries included” approach to module development. Ansible modules are low-level building blocks, things like “service”, package management modules, “user” and “group” modules, all the way to modules to provision new cloud instances. Rather than have users find what modules are the best among a community commons, Ansible encourages contribution of these modules to the core standard library. The result of this is that users picking up the program have some 235 modules, all community maintained, which they can use to start assembling their infrastructure automation. Module contribution is maximized by keeping source code straightforward, and making sure there are good resources about how (and when) to write new modules.
Content using the modules are usually grouped into “roles”, as shown above. Roles can be freely shared here, which is powered by GitHub repositories.
Ansible is freely available to download. To find out how to try Ansible and about other features contained in it, visit the documentation site.
About the Author
Michael DeHaan is the CTO of Ansible, Inc, and creator of popular DevOps-friendly automation systems Cobbler and Ansible. Prior to Ansible, Michael helped build and define systems management software for such companies as IBM, Adaptec, Red Hat (where he built Cobbler as part of the Emerging Technologies group), and Puppet Labs. Michael lives in Morrisville, NC and can be found online and on Twitter.
Configuration management is the foundation that makes modern infrastructure possible. Tools that enable configuration management are required in the toolbox of any operations team, and many development teams as well. Although all the tools aim to solve the same basic set of problems, they adhere to different visions and exhibit different characteristics. The issue is how to choose the tool that best fits each organization's scenarios.
This InfoQ article is part of a series that aims to introduce some of the configuration tools on the market, the principles behind each one and what makes them stand out from each other. You can subscribe to notifications about new articles in the series here.