Infrastructure as Code (IaC) is the practice of specifying computing system configurations through code, automating system deployment, and managing the system configurations through traditional software engineering methods. For example, a server farm that contains numerous nodes with different hardware configurations and different software package requirements can be specified using configuration management languages such as Puppet, Chef, CFEngine, or Ansible and deployed automatically without human intervention.
The IaC paradigm brings the infrastructure, along with the code, tools, and services used to manage it, into the purview of a software system. IaC practices therefore treat configuration code like production code and apply traditional software engineering practices such as reviewing, testing, and versioning to configuration code as well.
A lot of work has been done to write maintainable code and achieve high design quality in traditional software engineering. Similar to production code, configuration code may also become unmaintainable if the changes to configuration code are made without diligence and care.
In this context, this article defines configuration smells and presents a catalog of them.
Configuration Smells
We define configuration smells as follows:
Configuration smells are the characteristics of a configuration program or script that violate the recommended best practices and potentially affect the program's quality in a negative way.
In traditional software engineering, bad smells are classified as implementation (or code) smells and design smells, based on the granularity of the abstraction in which the smell arises and which it affects. Similarly, configuration smells can be classified as implementation configuration smells, design configuration smells, documentation configuration smells, and so on. In this article, we present two major categories: implementation configuration smells and design configuration smells.
Implementation Configuration Smells
Implementation configuration smells are quality issues in configuration code related to naming conventions, style, formatting, and indentation. Here, we list the implementation configuration smells with a brief description of each.
- Missing Default Case: A default case is missing in a case or selector statement.
- Inconsistent Naming Convention: The naming convention used deviates from the recommended naming convention.
- Complex Expression: A program contains a complex expression that is difficult to understand.
- Duplicate Entity: Duplicate hash keys or duplicate parameters are present in the configuration code.
- Misplaced Attribute: Attribute placement within a resource or a class does not follow the recommended order (for example, mandatory attributes should be specified before optional attributes).
- Improper Alignment: The code is not properly aligned (such as all the arrows in a resource declaration) or tabulation characters are used.
- Invalid Property Value: An invalid value of a property or attribute is used (such as a file mode specified using a 3-digit octal value rather than a 4-digit one).
- Incomplete Tasks: The code has FIXME and TODO tags indicating incomplete tasks.
- Deprecated Statement Usage: The configuration code uses one of the deprecated statements (such as “import”).
- Improper Quote Usage: Single and double quotes are not used properly. For example, boolean values should not be quoted, and variable names should not be used in single-quoted strings.
- Long Statement: The code contains long statements (that typically do not fit on a screen).
- Incomplete Conditional: An “if..elsif” construct is used without a terminating “else” clause.
- Unguarded Variable: A variable is not enclosed in braces when being interpolated in a string.
Let us consider a few examples of the above-mentioned implementation configuration smells in Puppet code.
1.  if $version == '4.4' or $version == '4.2' or $version != '4.5' or $version == '4.9' or
    $version == '5.0' {
2.    case $::operatingsystem {
3.      'debian': {
4.        apt::source { 'packages.dotdeb.org-repo.app':
5.          location => 'http://repo.app.com/dotdeb/',
6.          release  => $::lsbdistcodename,
7.          repos    => 'all',
8.          include_src => true
9.        }
10.     }
11.   }
12. }
13. elsif $version in ['33', '3.3'] {
14. }
We can spot the following smells in the above snippet:
- The statement at line 1 chains four logical operators, which makes it a complex expression. It also suffers from the long statement smell.
- The statement at line 2 is a case statement without a default case, which makes it an instance of the missing default case smell.
- The statement at line 8 is not properly aligned with the arrows of the preceding attributes. It suffers from the improper alignment smell.
- The statement at line 13 is an “elsif” without a terminating “else”. This results in the incomplete conditional smell.
We can refactor the above snippet in the following way:
1.  if $version in ['4.4', '4.2', '4.9', '5.0'] or $version != '4.5' {
2.    case $::operatingsystem {
3.      'debian': {
4.        apt::source { 'packages.dotdeb.org-repo.app':
5.          location    => 'http://repo.app.com/dotdeb/',
6.          release     => $::lsbdistcodename,
7.          repos       => 'all',
8.          include_src => true
9.        }
10.     }
11.     default: {}
12.   }
13. }
14. elsif $version in ['33', '3.3'] {
15. }
16. else {
17. }
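Some of the other implementation smells in the list can be sketched in the same way. The fragment below is a minimal illustration, not code from any studied repository: the variable `$app_user`, the file paths, and the `notify` titles are hypothetical names chosen for this example. It contrasts improper quote usage, an unguarded variable, and an invalid property value with a cleaner equivalent.

```puppet
# Hypothetical variable used only for illustration.
$app_user = 'deploy'

# Smelly version:
# - the boolean value is quoted (improper quote usage)
# - the variable sits inside a single-quoted string, so it is not interpolated
# - the file mode uses a 3-digit octal value (invalid property value)
file { '/tmp/smelly.conf':
  ensure  => file,
  replace => 'true',
  content => 'owner is $app_user',
  mode    => '644',
}

# Smelly: the interpolated variable is not enclosed in braces (unguarded variable).
notify { "smelly: deploying as $app_user": }

# Cleaner version: unquoted boolean, braces around the interpolated variable
# inside a double-quoted string, and a 4-digit file mode.
file { '/tmp/clean.conf':
  ensure  => file,
  replace => true,
  content => "owner is ${app_user}\n",
  mode    => '0644',
}

notify { "clean: deploying as ${app_user}": }
```

Both versions compile; the difference lies in readability and in how easily a reader can tell what the code will actually produce.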
Design Configuration Smells
Design configuration smells reveal quality issues in the module design or structure of a configuration project.
- Multifaceted Abstraction: Each abstraction (e.g. a resource, class, ‘define’, or module) should be designed to specify the properties of a single piece of software. In other words, each abstraction should follow the single responsibility principle. An abstraction suffers from multifaceted abstraction when its elements are not cohesive.
  The smell may occur in the following two forms:
  - a resource (file, package, or service) declaration specifies attributes of more than one physical resource, or
  - the language elements declared in a class, ‘define’, or module are not cohesive.
- Unnecessary Abstraction: A class, ‘define’, or module must contain declarations or statements specifying the properties of a desired system. An empty class, ‘define’, or module indicates the unnecessary abstraction smell and should be removed.
- Imperative Abstraction: Puppet is declarative in nature. The presence of imperative statements (such as “exec”) defies the purpose of the language. An abstraction containing numerous imperative statements suffers from the imperative abstraction smell.
- Missing Abstraction: Resource declarations and statements are easy to use and reuse when they are encapsulated in an abstraction such as a class or ‘define’. A module suffers from the missing abstraction smell when resources and language elements are declared and used without encapsulating them in an abstraction.
- Insufficient Modularization: An abstraction suffers from this smell when it is large or complex and thus can be modularized further. This smell arises in the following forms:
  - a file contains a declaration of more than one class or ‘define’, or
  - the size of a class declaration exceeds a certain threshold, or
  - the complexity of a class or ‘define’ is high.
- Duplicate Block: A duplicate block of statements longer than a threshold suggests that a suitable abstraction definition is missing. Thus, a module containing such a duplicate block suffers from the duplicate block smell.
- Broken Hierarchy: The use of inheritance must be limited to the same module. The smell occurs when inheritance is used across namespaces (the “is-a” relationship is not followed).
- Unstructured Module: Each module in a configuration repository must have a well-defined and consistent module structure. A recommended structure for a module is the following.
<Module name>
- manifests
- files
- templates
- lib
- facts.d
- examples
- spec
A repository with an ad-hoc structure suffers from the unstructured module smell, which impairs the understandability and predictability of the repository.
- Dense Structure: This smell arises when a configuration code repository has excessive and dense dependencies without any particular structure.
- Deficient Encapsulation: This smell arises when a node definition or ENC (External Node Classifier) declares a set of global variables to be picked up by the included classes in the definition.
- Weakened Modularity: Each module must strive for high cohesion and low coupling. This smell arises when a module exhibits high coupling and low cohesion.
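As one illustration from the list above, the deficient encapsulation smell can be sketched as follows. This is a minimal example under stated assumptions: the node names, the classes `ntp_implicit` and `ntp_explicit`, and the server name are hypothetical, and the typed class parameter assumes Puppet 4 or later.

```puppet
# Smelly: the node definition sets a variable, and the included class reads it
# implicitly from the enclosing node scope.
node 'web01.example.com' {
  $ntp_server = 'ntp.example.com'
  include ntp_implicit
}

class ntp_implicit {
  file { '/etc/ntp.conf':
    ensure  => file,
    # $ntp_server is not declared here; it leaks in from the node definition.
    content => "server ${ntp_server}\n",
  }
}

# Cleaner: the class declares what it needs as a parameter, and the node
# definition passes the value explicitly.
class ntp_explicit (String $server) {
  file { '/etc/ntp.conf':
    ensure  => file,
    content => "server ${server}\n",
  }
}

node 'web02.example.com' {
  class { 'ntp_explicit':
    server => 'ntp.example.com',
  }
}
```

With the explicit parameter, the dependency of the class on the server name is visible in its signature, so the class can be reused without knowing which variables a node happens to define.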
Let us consider a few examples of the above-mentioned design configuration smells.
Example 1: Consider the following code snippet (Puppet)
# Contents of file package.pp
class package::web {
  …
}
class package::mail {
  …
}
class package::environment {
  …
}
class package::user {
  …
}
The above snippet shows a single file that contains the definitions of four classes. Stuffing multiple definitions into one file violates the single responsibility principle and leads to the multifaceted abstraction design configuration smell.
The snippet can be refactored by putting each class definition in its own file.
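A sketch of the resulting layout is shown below. It assumes the classes live in a module named `package` under a `modules` directory (the directory name is an assumption) and follows Puppet's convention of loading a class `package::web` from `manifests/web.pp` inside that module. The class bodies stay elided, as in the original snippet.

```puppet
# modules/package/manifests/web.pp
class package::web {
  # body unchanged (elided in the original)
}

# modules/package/manifests/mail.pp
class package::mail {
  # body unchanged (elided in the original)
}

# modules/package/manifests/environment.pp
class package::environment {
  # body unchanged (elided in the original)
}

# modules/package/manifests/user.pp
class package::user {
  # body unchanged (elided in the original)
}
```

Each comment marks the start of a separate file; the four classes are shown together here only for brevity.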
Example 2: Consider the following Puppet code snippet.
class web {
  exec { 'hadoop-yarn':
    …
  }
  exec { 'apache-util-set':
    …
  }
  exec { 'smail-invoke':
    …
  }
  exec { 'postfix-set':
    …
  }
}
Puppet is a declarative language; therefore, we should specify “what” we wish to achieve rather than “how”. Generous use of “exec” statements, however, makes the code imperative and defies the purpose of the language. Such a code fragment suffers from the imperative abstraction design configuration smell.
Such a snippet can be refactored by replacing the “exec” statements with Puppet language constructs that specify the “what”.
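The bodies of the “exec” statements above are elided, so the following is only a hypothetical illustration of the idea rather than a refactoring of that snippet. It assumes an “exec” that installs the postfix package through a shell command on a Debian-like system (the resource title and commands are invented for this example), and shows how the built-in `package` and `service` resource types express the same intent declaratively.

```puppet
# Imperative (hypothetical): the shell command spells out "how" to reach the state.
exec { 'install-postfix':
  command => '/usr/bin/apt-get install -y postfix',
  unless  => '/usr/bin/dpkg -s postfix',
}

# Declarative: the resources state "what" the system should look like and
# leave the steps to Puppet's providers.
package { 'postfix':
  ensure => installed,
}

service { 'postfix':
  ensure  => running,
  enable  => true,
  require => Package['postfix'],
}
```

The declarative form is also portable across platforms, because the provider behind `package` chooses the right package manager.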
Example 3: Consider a repository containing configuration code where the dependency graph (among the modules of the repository) forms a dense and complex structure. Such a repository is a minefield of bugs because of the brittleness that the dense and complex structure introduces. In our catalog, we refer to such repositories as suffering from the dense structure design configuration smell.
Striving for a Puppet repository in which each module has very few dependencies (if any) and is mostly independent can eliminate this smell.
Impact of smells
One could argue that the smells listed above are simplistic in nature and that their impact on a configuration project or its people is trivial.
Not all smells have the same impact on a configuration project. Individual smells may indeed look trivial; however, when a project has a high volume of smells, their combined impact on maintainability can be huge. The situation is analogous to the popular story of the straw that broke the camel’s back. It is hard to believe that a single straw can break a camel’s back. However, when the camel already carries a large bundle of straws and its master keeps adding straws one by one, there comes a point at which one more straw breaks its back. An individual smell is like that straw: it may cause a huge impact when considered together with other smells.
Tools to Detect Configuration Smells
To detect the design configuration smells above, we have developed Puppeteer, an open-source tool written in Python. Puppet-Lint can be used to detect implementation configuration smells.
We carried out a detailed study in which we analyzed 4,621 open-source repositories containing Puppet code to explore interesting aspects associated with configuration smells. More details are available in the paper, which appeared in the proceedings of the Mining Software Repositories conference 2016.
About the Authors
Tushar Sharma is currently a researcher at Athens University of Economics and Business. Earlier, he worked with Siemens Research and Technology Center, Bangalore, India for more than 7 years. The topics related to software design, refactoring, code and design quality, technical debt, change impact analysis and infrastructure as code (IaC) define his career interests. He has co-authored three books including "Refactoring for Software Design Smells: Managing Technical Debt". He is an IEEE senior member.
Marios Fragkoulis is pursuing his PhD on data representation, queries, and management supervised by Prof. Diomidis Spinellis at the Department of Management Science and Technology of the Athens University of Economics and Business. He holds an MSc with distinction in Computer Science from the Department of Computing of the Imperial College London and a BSc with Distinction from the Department of Management Science and Technology of the Athens University of Economics and Business. In 2015 he worked part time as an infrastructure operations engineer in OTE S.A. and as a software engineer in GRNET S.A. in Greece. He is the developer of the PiCO QL online data analytics system.
Diomidis Spinellis is a Professor in the Department of Management Science and Technology at the Athens University of Economics and Business, Greece. He has written two award-winning, widely-translated books: “Code Reading” and “Code Quality: The Open Source Perspective”. His new book “Effective Debugging: 66 Specific Ways to Debug Software and Systems” should be hitting the bookshelves by the time you read this statement. He holds an MEng in Software Engineering and a PhD in Computer Science, both from Imperial College London. Diomidis has served as an elected member of the IEEE Computer Society Board of Governors (2013–2015), and is a senior member of the ACM and the IEEE. Since January 2015, he has been serving as the Editor-in-Chief of IEEE Software.