Introduction
As technologists, we're constantly bombarded with new paradigms, frameworks, and tools to achieve our goals. When adopting any new technology we need to understand its implications in the operational dimension as well as the development dimension: are we really simplifying overall, or are we simply trading one set of problems for another?
Hypervisors enabled virtual machines (VMs), which we created as though memory, compute, network and storage grew on trees. Then came a suite of management tools to help us shepherd our VMs and gain the most value from this disruptive advance. Docker has popularized containers and once again we are facing a proliferation of operational components that is driving the need for improved management tooling.
Amazon's AWS Lambda service, along with other serverless offerings, has recently been attracting interest for its simplicity, capabilities and potential for cost reduction.
With serverless models, we are trading the management of less granular components like virtual machines, containers or complete applications for the management of very granular functions. If your operations team is geared and tooled to manage 100 applications, will they be ready to manage the 500 to 1,000 entities that may result from the functional decomposition?
What is AWS Lambda?
Ignoring the possible Computer Science reference to Lambda [1], which may or may not be intended by those responsible for naming within Amazon, AWS Lambda is a service for running your functions, expressed in a variety of programming languages, with complete disregard for any web server, application server or framework scaffolding, let alone the notion of some kind of computer to execute on. Simply write your function, log into your AWS account, deploy your function into the AWS Lambda service and you are done. Your function is ready to run; now you just need to decide when it should run.
Before I talk about the execution of your Lambda function, here is a complete implementation of a function written in JavaScript for the Node.js runtime, ready for deployment in AWS Lambda:
exports.myHandler = function(event, context) {
    // Report success to Lambda and return 'hello' as the result.
    context.succeed('hello');
}
Typically, even with the declarative power of Node.js, there would be around 20 more lines of code before this function could be exposed and executed as a service. AWS Lambda has certainly simplified our implementation. What about our operations?
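To make that saving concrete, here is a minimal sketch of the kind of scaffolding a plain Node.js service might need around the same one-line function. It uses only the built-in http module; the /hello path, the default port and the function names (hello, requestListener, start) are illustrative assumptions rather than anything prescribed by AWS.

```javascript
// Hypothetical sketch: the plumbing you would write yourself without Lambda.
const http = require('http');

// The same one-line business logic as the Lambda handler.
function hello() {
  return 'hello';
}

// Everything from here down is scaffolding: routing, status codes, error handling.
function requestListener(req, res) {
  let status = 200;
  let body;
  if (req.method !== 'GET' || req.url !== '/hello') {
    status = 404;
    body = 'not found';
  } else {
    try {
      body = hello();
    } catch (err) {
      status = 500;
      body = 'internal error';
    }
  }
  res.writeHead(status, { 'Content-Type': 'text/plain' });
  res.end(body);
}

// Bind the listener to a port; nothing runs until start() is called.
function start(port = 3000) {
  return http.createServer(requestListener).listen(port, () => {
    console.log(`listening on port ${port}`);
  });
}
```

Calling start() and requesting /hello on the chosen port should return the same hello response, but the process, the port and the server lifecycle are now yours to operate.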
Some refer to this as stateless computing or serverless computing. Personally I prefer the second term, as there is clearly state somewhere (probably in a database service that the function may leverage), even though the function itself is essentially stateless. The same argument could be held against the serverless term: clearly there are servers floating around in the cloudy background, but their existence is implicit and automatic rather than explicit and manual.
The next area of value in AWS Lambda stems from the ability to easily associate your function with all manner of triggers via both web-based and command line tools. There are more than 20 different triggers that can be used—most of them being from other AWS services such as S3, Kinesis and DynamoDB—including the ability to associate your function with a URL that can be exposed externally via an API Gateway. Simply browse the available event sources and integration points, pick the one or more you want to associate with your function and the integration is complete. Your function will now execute automatically in response to, for example, some new data being posted into the selected S3 bucket.
To summarize:
- AWS Lambda Functions are easy to build and deploy.
- You only pay for resources used in the execution of your Lambda Function, not for servers to standby and be ready to execute your function. This should yield cost savings where you have highly variable load.
- Automatic scaling of the underlying compute resources falls out as part of the approach. In fact, it isn’t even really a question or consideration from a technical perspective.
- Integration with other AWS services is trivial as is exposing your function for direct invocation.
- You are trading management of fewer, coarser-grained components for the management of an increased number of finer-grained Lambda Functions.
To Function or not to Function?
So this all sounds amazing—write a function, deploy it in AWS Lambda, connect it up, let scaling happen automatically and only pay for the resources used in its execution.
However, AWS Lambda isn’t a silver bullet. It needs to be considered as just another tool in our toolbox, and we need to avoid seeing nails everywhere just so we can wield our shiny new hammer-shaped tool. When selecting the right tool from our toolbox, we need to consider the development and operational dimensions. Just because something is easier to develop and possibly cheaper to deploy doesn’t mean it is easier to manage, or even practical to manage at scale.
Functions are best used where there is a simple transformation to be performed. One example, which Amazon itself uses frequently, is creating a thumbnail image from a document loaded into an S3 bucket. Another would be sending an SMS message when a data stream (maybe from Amazon Kinesis) shows values falling outside a normal expected range: an alarm!
Both of these examples share common traits.
Both are stateless; that is, information comes in and is used to create new information (thumbnail or alarm), with no need to refer to anything else beyond the input.
Additionally, these examples require limited error handling. If a thumbnail can’t be created, then either no thumbnail is created or a stock thumbnail is output instead. Similarly, the alarm itself represents the error case: either there is an alarm to raise or there isn’t. There isn’t really a third case where the input stream is in error and needs to be reported in some way beyond the creation of the alarm [2].
Finally, there is little control flow or conditionality that leaks outside of these functions. The data driven behavior—if-then-else—resides within the function.
If your requirements are similar then coding them as a function for deployment in AWS Lambda has potential. However, not everything fits this model: many solutions need state, error handling and complex conditional choreography.
Some would argue that you can always break down a larger problem into small pieces. This is certainly true, but there is a balance to be struck. Decomposition into many small functions increases the need for inter-function plumbing, expands the attack surface (which many consider a security risk) and places greater load on operations and management, let alone the cognitive load required to comprehend, for maintenance and change, a complex interwoven web of distributed functions.
How will Enterprise Organizations Adopt Cloud Functions?
The majority of companies we work with are established Global 2000 organizations rather than new digital economy companies and startups, and as such I haven’t heard them beating the cloud function drum too much. However, this is starting to change. At the moment I would say about 10% of these companies have expressed a direct interest either in Cloud Function architectures and platforms or in architectures supporting pay per transaction, and this percentage is increasing.
Aside from the technical sweet spot for Cloud Functions being relatively small, larger, more mature organizations typically have a lot of regulatory and compliance obligations from a mix of external and internal bodies. Cloud Functions, along with microservices-based architectures and agile software development practices, are somewhat at odds with the governance desires of these organizations, resulting in an impedance mismatch. We can create new software and modify existing software much quicker these days, and certainly quicker than we can push it through an established software development lifecycle that has a significant governance dimension.
To be able to leverage Cloud Functions at scale we need to expand their technical sweet spot and appreciate the need for checks and balances that enable Managers, Operators, CIOs, CSOs and Risk & Compliance Officers to sleep soundly at night.
On the technical sweet spot front, we need to expand the error handling facilities of Cloud Functions to make sure that more than just trivial transformation functions can be componentized this way. Being able to trap and expose error conditions in a robust manner then falls neatly into the next area that needs addressing, flow of control and complex choreography. Cloud Functions need to be able to be wired together in a manner that doesn’t just shift the pain from one programming language to another, or from a programming language to a user interface that requires 20 steps to capture the plumbing and then makes it hard to see what you've plumbed afterwards. There needs to be a net improvement in productivity in both the creation and comprehension dimensions.
With respect to governance, just because it is a Cloud Function doesn’t mean that it should escape controls over which language, runtime and libraries it uses, how much compute, network and IO it consumes, or which region it executes in relative to the data services it might be using (think Safe Harbor etc.), to name just a few of the issues that worry operations teams and managers.
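One way to picture such controls is as a machine-readable policy that each function's metadata is checked against before deployment. The sketch below is purely illustrative: the policy fields, the metadata shape and the checkFunction name are assumptions for this example and do not correspond to any AWS or vendor API.

```javascript
// Hypothetical policy; every field name here is an illustrative assumption.
const policy = {
  allowedRuntimes: ['nodejs', 'python'],
  allowedRegions: ['eu-west-1', 'eu-central-1'],
  maxMemoryMb: 512,
  bannedLibraries: ['left-pad'],
};

// Return a list of human-readable violations; an empty list means compliant.
function checkFunction(meta, rules = policy) {
  const violations = [];
  if (!rules.allowedRuntimes.includes(meta.runtime)) {
    violations.push(`runtime ${meta.runtime} is not approved`);
  }
  if (!rules.allowedRegions.includes(meta.region)) {
    violations.push(`region ${meta.region} is outside the approved regions`);
  }
  if (meta.memoryMb > rules.maxMemoryMb) {
    violations.push(`memory ${meta.memoryMb} MB exceeds ${rules.maxMemoryMb} MB`);
  }
  for (const lib of meta.libraries || []) {
    if (rules.bannedLibraries.includes(lib)) {
      violations.push(`library ${lib} is banned`);
    }
  }
  return violations;
}
```

For example, a function described as { runtime: 'nodejs', region: 'us-east-1', memoryMb: 256, libraries: [] } would be reported with a single region violation, which an operations team could use to block or flag the deployment.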
The Only Constant is Change
As mentioned earlier, Cloud Functions should be one of many tools in our toolbox. Whilst it might be the most recent architectural tool, we can be sure that it will not be the last.
Architectural Patterns
The “Architectural Patterns” diagram (figure 1) above provides an indicative position of the different architecture choices against the axes of development/operational overhead versus statelessness. As with all attempts at categorization there are exceptions, but this is broadly correct.
On the vertical axis, Cloud Functions are intrinsically stateless, a property forced on them by the execution framework, whereas the other patterns can all move freely between stateful and stateless, although each has strong preferences that would be tough, or an anti-pattern, to deviate from.
The horizontal axis illustrates the relative effort/overhead in the development and operational dimensions. At one end, monolithic architectures have a lot of development inertia: things that get in the way and slow down initial creation and subsequent alteration of a solution built using this approach. Operationally, monolithic architectures are as simple as can be: one large program that has to be run, secured and monitored. At the other extreme, Cloud Functions are arguably the simplest expression of reusable, cloud-accessible algorithms from a development perspective. However, a complete solution is made from many Cloud Functions (and more than likely other non-Cloud Function components) that each need to be installed, updated, secured, scaled, governed and monitored.
Not many people would set out today to develop a solution based on a monolithic architecture principle but, given the right-tool-for-the-right-job approach, you never know. Tiered architectures, 12-factor ‘Cloud Native Applications’, microservice architectures and their logical conclusion of Cloud Functions are all modern options. While architects and engineers select the right approach to solve problems they face today, they must also address how requirements will change over time. An added pressure is talent retention and recruitment; the technologies being used within an organization can have a big impact on both of these. It isn’t easy to recruit people to work on yesterday’s technology, even if yesterday is part of today and part of the foreseeable future.
The only constant is change and we cannot be prescriptive about how software should be developed. To that end, we should use platforms and tools that support all of the architecture models mentioned above including a capacity to adapt to embrace the future—as far as we can envisage it anyway.
Conclusion
Gartner predicts that, by 2020, more than 50 percent of all new applications developed on PaaS will be IoT-centric, disrupting conventional architecture practices.
"IoT adoption will drive additional use of PaaS to implement IoT-centric business applications built around event-driven architecture and IoT data, instead of business applications built around traditional master data," said Benoit Lheureux, research vice president at Gartner. "New IoT-centric business applications will drive a transformation in application design practices that focus on real-time contextually rich decisions, event-analysis, lightweight workflow, and broad access to Web-scale data." [3]
Delivering enterprise-scale solutions out of Cloud Functions can be difficult. Technologies change, and the universe of services, devices, frameworks, business expectations and their associated risks is expanding at an increasing rate. It’s exciting to be at a point in our industry where tooling is evolving just as rapidly, enabling us to combine yesterday’s problems with today’s trends and what tomorrow will bring in a reliable, productive manner.
In order to achieve accelerated time to market, developers need to focus their efforts on delivering business value rather than modifying yet another automation script to get their software to build, test and deploy. Execution assumptions need to be extracted from programs and either avoided altogether or captured in metadata, so that the runtime platform can understand the operational needs and can reason and apply execution decisions at machine speed and against the volume of machine-accessible information.
At the intersection between development and operations lives trust. Trust, and that feeling of comfort and safety, originates from knowing that you’ve done all you can and that someone (or something) has your back because you’re only human. The platform and tools we use to develop new software, maintain older software and operate both together harmoniously need to evaluate and enforce the governance policies laid down by our organization to ensure maximum creativity and velocity within guardrails and with non-invasive oversight and protection.
If we make workloads smaller, focused and simpler whilst capturing operational requirements and governance policies in machine-readable forms that can be reasoned upon then we create the potential for a global compute fabric that embraces all manner of devices and use cases in a safe and secure fashion.
Even in a greenfield situation, Cloud Functions need to be considered as another tool alongside other architecture choices. The development and operational efforts need to be considered together to make the right overall decision and not optimize for a local minimum. Select platforms and tools that enable choice: choice in programming language, choice in framework, choice in architecture, choice in on-premises, off-premises or mixed infrastructure, and so on. Above all else, ensure that the platforms and tools you use embrace and facilitate trust and governance within the choices you make.
References
[1] Lambda Calculus is a branch of Computer Science and Mathematics that deals with the abstraction of functions and their application to create a universal computation model.
[2] We could also consider error reporting for the case where the alarm message itself cannot be raised, but that would be getting into the weeds at this point.
[3] According to Gartner, "By 2020, More Than Half of New Applications Developed on PaaS Will Be IoT-Centric".
About the Author
Dean Sheehan has more than 25 years’ experience in the technology industry covering consulting, product development, product management and solution deployment throughout the retail, financial and telecom industries, with significant expertise in data centre automation. He currently serves as Director of Solutions Consulting, EMEA for Apcera. Apcera provides developers with the tools to simplify the process of software creation and operations teams with the control and insight to enable effective, economic and secure operation of that software.