The telecom industry has seen its service delivery processes change in recent years through the adoption of DevOps principles and tools. Ericsson’s talk at the DevOps Enterprise Summit 2017 in London and their continuous delivery paper outline the challenges they faced and how they overcame them.
The hurdles telecom systems vendors face in deploying their systems are unique in terms of scale, regulatory restrictions, and robustness and availability requirements. Ericsson moved from 7 weeks to test, 6 months to deploy and 2-3 years to develop a release to a model with 90 minutes to test, 3 weeks to deploy and 6-week development cycles. Each new release of the telecom software is rolled out to multiple network operators. There were initial hiccups in adopting the newer practices, but over time the gap between the general availability of a release and its going live on an operator’s node gradually decreased. The feature release cadence is monthly, while network operators may choose to deploy on either a monthly or a quarterly basis.
Image Courtesy - http://cloudpages.ericsson.com/continuous-delivery
The development model was the first thing to change - from multiple parallel release chains to a single track. The first product to adopt this change was the Serving GPRS Support Node – Mobility Management Entity software (SGSN-MME), followed by the Evolved Packet Gateway product. The Evolved Packet Core is a telecom framework that aims to provide unified voice and data services on a 4G network, as opposed to separate packet-switched and circuit-switched elements for data and voice respectively.
The transformation process began in 2009. It started with process changes - smaller cross-functional teams, the assignment of product owners and the introduction of scrum masters. Lean processes were put in place. Metrics like the frequency of code commits were measured to gauge their effectiveness, but these metrics ended up being misused. After some of these changes showed promise, there was leadership pressure to increase the number of teams. This led to further problems: the teams grew too rapidly, and the platform changes required were underestimated, in part because the platform was not cloud-ready. The development environments and CI practices were also immature, as were the program structures, which led to poor visibility into the teams’ progress. The combined velocity of the teams fell below its earlier level. A stock-taking exercise in 2015 revealed these problems.
Several changes were made to resolve these problems. Priority was given to quality over speed, with a focus on quality acceptance tests. A new process was put in place to onboard new teams and members. On the tools front, the teams moved to virtualization with the Kernel-based Virtual Machine (KVM), which cut their upgrade times from 22 hours to 3 hours. KVM is a virtualization module in the Linux kernel that lets the kernel act as a hypervisor, and it forms a key part of Ericsson’s cloud platform. Continuous integration frameworks were adopted, some of them Docker-based. A centralized hardware allocation model, in which resources are allocated on request, was also adopted. This led to easier management and better utilization of the overall hardware. Organizational changes included project management practices, better planning, feature teams, daily stand-up meetings and a coaching (mentoring) network to share knowledge.
Other products inside Ericsson have also adopted the CD model. This is part of a wider trend in the telecom industry, where traditional service delivery practices have been giving way to DevOps. The trend is also driven by network functions virtualization (NFV), which moves network service functions that were previously hardware-based into software. As more and more functions move to software implementations, it has become easier to adopt DevOps tools and practices.