Implementing feature toggles according to a categorisation of their longevity and dynamism helps manage their operational complexity, says Pete Hodgson, consultant at ThoughtWorks. In his post, he expands on Martin Fowler's FeatureToggle pattern, proposes four toggle categories (release, ops, experiment and permission) and discusses implementation strategies for them. In large projects, the difficulty of continuously delivering releases calls for feature toggles.
"Feature toggles are a powerful technique, allowing teams to modify system behaviour without changing code", Pete writes. He shows a simple application of a dynamic feature toggle:
function reticulateSplines() {
  if( featureIsEnabled("use-new-SR-algorithm") ) {
    return enhancedSplineReticulation();
  } else {
    return oldFashionedSplineReticulation();
  }
}
In the example above, the call to featureIsEnabled() is what Pete calls the decision point or toggle point, while the function itself implements a toggle router, often backed by a simple configuration file (the toggle configuration). Advanced feature toggles might take the toggle context into account and implement "the concept of user cohorts - groups of users who consistently experience a feature as always being On or Off".
Source: http://martinfowler.com/articles/feature-toggles.html
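To make the relationship between toggle point, toggle router, toggle configuration and toggle context more concrete, here is a minimal sketch. It is not taken from Pete's post: the createToggleRouter() helper, the shape of the configuration object and the "beta-testers" cohort are assumptions made for illustration only.

```javascript
// Hypothetical toggle configuration; in practice this might be loaded from a file.
const toggleConfig = {
  "use-new-SR-algorithm": { enabled: true, cohorts: ["beta-testers"] },
  "dark-mode": { enabled: true },
};

// Builds a toggle router over the given configuration (illustrative helper).
function createToggleRouter(config) {
  return {
    // context is the optional toggle context, e.g. the cohort of the current user.
    featureIsEnabled(name, context = {}) {
      const toggle = config[name];
      if (!toggle || !toggle.enabled) return false; // unknown or disabled toggles are Off
      if (!toggle.cohorts) return true;             // no cohort restriction: On for everyone
      return toggle.cohorts.includes(context.cohort);
    },
  };
}

const router = createToggleRouter(toggleConfig);

// Toggle point: the calling code only asks the router for a decision.
const useNewAlgorithm = router.featureIsEnabled("use-new-SR-algorithm", {
  cohort: "beta-testers",
});
```

In this sketch a user outside the listed cohorts consistently gets the old behaviour, which is the property the cohort concept is meant to provide.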
The author suggests these different toggle types:
- Release toggles decouple deployment from releasing features and are temporary in nature. They are often used for canary rollout strategies. "Product Managers may also use a product-centric version of this same approach to prevent half-complete product features from being exposed to their end users." They can also serve as an alternative to feature branches, where merging features into a release often proves complicated and happens too late in the process.
- Ops toggles are a generalisation of the circuit breaker pattern, known from Netflix's Hystrix implementation. These toggles modify the operational aspects of the system and can be used to gracefully degrade the service under high load.
- Experiment toggles are short-lived toggles that can be used for marketing purposes in A/B testing. In this scenario "each user of the system is placed into a cohort and at runtime the toggle router will consistently send a given user down one codepath or the other, based upon which cohort they are in." By evaluating the aggregate behaviour of different cohorts the effect of the feature can be assessed.
- Permission toggles are mostly long-lived optional feature switches that can be used to implement pricing strategies such as a freemium model or silver, gold and platinum product tiers. These toggles warrant a more robust implementation than simple if/then/else statements to improve maintainability.
The diagram below shows these categories plotted on the longevity and dynamism dimensions:
Source: http://martinfowler.com/articles/feature-toggles.html
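The consistent cohort assignment that experiment toggles rely on can be sketched as follows. This is only one possible approach, assumed here for illustration: the user id is hashed together with the experiment name so that the same user always lands in the same cohort without any stored state. The function names, the FNV-1a-style hash and the "treatment"/"control" labels are not from Pete's post.

```javascript
// FNV-1a-style 32-bit hash over the UTF-16 code units of a string.
function hashString(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Deterministically places a user in a cohort for one experiment.
// treatmentPercent is the share of users (0-100) who see the new codepath.
function cohortFor(userId, experimentName, treatmentPercent) {
  const bucket = hashString(`${experimentName}:${userId}`) % 100;
  return bucket < treatmentPercent ? "treatment" : "control";
}

// The same user gets the same answer on every request.
const first = cohortFor("user-42", "new-checkout-flow", 50);
const second = cohortFor("user-42", "new-checkout-flow", 50);
```

Including the experiment name in the hash input means a user's cohort in one experiment does not determine their cohort in another.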
Martin Fowler additionally distinguishes between build-time and run-time feature toggles. Build-time toggles can be used as a static alternative to run-time release toggles, whereby the decision on which new features make it into a release is made during compilation. In Agile Release Train variants, release builds follow a fixed time schedule: tested features may board the train, while unfinished or unstable features are left out at build time. Alternative approaches to achieving this decoupling might rely on canary deployments of microservices.
"Feature Toggling has a tendency to become more and more prevalent in a system over time", Pete contends, and he proposes the following principles when implementing feature toggles:
- Avoid conditionals at the toggle point by applying the Strategy pattern and thereby encapsulating if/then/else statements into the routing layer.
- De-couple decision points from decision logic by introducing a decision object that implements that logic. That way, for example, a change in how features are grouped does not break the code.
- De-couple your code from the feature toggling infrastructure by injecting the decision object as a constructor parameter so that any method that you want to toggle can use it. In that way you can design your code to be unaware of the toggle router and thereby simplify development and testing.
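A rough sketch of how these three principles might fit together is shown below. It is an illustration under stated assumptions rather than Pete's own code: createFeatureDecisions(), SplineReticulator and the two placeholder algorithms are hypothetical names, and the strategy is chosen once at construction time, which would not suit toggles that must change while the object is alive.

```javascript
// Placeholder implementations standing in for the real algorithms.
function enhancedSplineReticulation() { return "enhanced"; }
function oldFashionedSplineReticulation() { return "old-fashioned"; }

// Decision object: the only place that knows toggle names and how they combine.
function createFeatureDecisions(toggleRouter) {
  return {
    useNewSplineAlgorithm() {
      return toggleRouter.featureIsEnabled("use-new-SR-algorithm");
    },
  };
}

// The toggled code receives the decision object through its constructor
// and never talks to the toggle router directly.
class SplineReticulator {
  constructor(featureDecisions) {
    // Strategy selection happens here, so reticulate() contains no conditional.
    this.strategy = featureDecisions.useNewSplineAlgorithm()
      ? enhancedSplineReticulation
      : oldFashionedSplineReticulation;
  }

  reticulate() {
    return this.strategy();
  }
}

// In a test, a hand-written decision object replaces the whole toggle infrastructure.
const reticulator = new SplineReticulator({ useNewSplineAlgorithm: () => true });
const result = reticulator.reticulate();
```

Because the class depends only on the decision object, both codepaths can be tested without a toggle router or configuration file.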
Pete goes on to discuss the different toggle configuration options. While some release toggles can be enabled or disabled at application startup, other toggles, such as ops toggles, need to be reconfigured at runtime, and experiment and permission toggles act per user session or even per request. The author recommends static configuration as the preferred mode for feature toggle configuration, since, when done correctly as configuration-as-code, it can be versioned and otherwise treated as code (reviewed, deployed, etc.). For more advanced settings, he advises using distributed key-value stores such as Consul, etcd or ZooKeeper.
Feature toggles are gaining importance with continuous delivery pipelines in large projects as they help decouple deploying from releasing. Coordinating branch merging and releasing features to common environments across parallel teams leads to serialisation of tasks and therefore decreased velocity. By removing the all-or-nothing release strategy, feature toggles help regain the necessary speed, albeit at a cost. Apart from increased operational complexity, additional risks arise from using feature toggles intensively, specifically when they are used as release toggles to mask unfinished code. According to Jim Bird, feature toggles make "code more fragile and brittle, harder to test, harder to understand and maintain, harder to support, and less secure." The main argument is that bringing untested code into production, where it might be exposed accidentally, is a bad idea. He cites a business failure at a financial institution as an example of such a situation. "Feature toggles require a robust engineering process, solid technical design and a mature toggle life-cycle management. Without these 3 key considerations, use of feature toggles can be counter-productive."
These risks aside, a recent paper from Facebook explores the use of feature toggles in practice at web scale. The Facebook implementation is based heavily on feature toggles. Going beyond configuration-based toggles, Facebook adopts a rule-based approach. An admin tool with a UI (Gatekeeper) is used to activate features by applying a flexible and selective filtering method to group user cohorts. In that way, features can be exposed to select users based on demographics and other parameters: "initially Gatekeeper may only enable the product feature to the engineers developing the feature. Then Gatekeeper can enable the feature for an increasing percentage of Facebook employees, e.g., 1% to 10% to 100%. After successful internal testing, it can target 5% of the users from a specific region. Finally, the feature can be launched globally with an increasing coverage, e.g., 1% to 10% to 100%", the authors of the paper explain.
Source: http://abhishek-tiwari.com/post/decoupling-deployment-and-release-feature-toggles
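The general idea of a rule-based rollout, as opposed to a plain On/Off configuration, might be sketched like this. It is emphatically not Facebook's Gatekeeper implementation, whose internals are not described here: the rule format, the user fields (id, isEmployee, region), the "NZ" region and the hash function are all assumptions chosen for illustration.

```javascript
// Deterministic bucket in [0, 100) derived from the user id (simple string hash).
function bucketOf(userId) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.codePointAt(0)) % 1000003;
  }
  return hash % 100;
}

// Each hypothetical rule pairs a filter with the percentage of matching users to enable.
// Rules are evaluated in order; the first matching rule decides.
const rolloutRules = [
  { matches: (user) => user.isEmployee, percent: 100 },
  { matches: (user) => user.region === "NZ", percent: 5 },
];

function featureIsEnabledFor(user, rules) {
  const rule = rules.find((r) => r.matches(user));
  if (!rule) return false; // users matched by no rule do not see the feature
  return bucketOf(user.id) < rule.percent;
}

const employee = { id: "e-1", isEmployee: true, region: "US" };
const outsider = { id: "u-7", isEmployee: false, region: "DE" };
const employeeSeesFeature = featureIsEnabledFor(employee, rolloutRules);
const outsiderSeesFeature = featureIsEnabledFor(outsider, rolloutRules);
```

Widening the rollout then amounts to editing the percentages or adding rules, without redeploying the code behind the toggle.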
Pete's blog post is published in instalments with additional content planned.