Key Takeaways
- Serverless Functions can be a great deployment model for (Micro)services since they offer the fastest path to production, the lowest entry cost, and high elasticity.
- When speed to production with a new service matters, when we want to minimise upfront investment, or when load curves are unknown, Serverless Functions are the deployment model to choose.
- But things can change over time: loads can become predictable and stable, and when that happens Serverless Functions can turn out to be much more expensive than traditional deployment models based on dedicated resources.
- It is therefore important to design our applications so that switching deployment models is as cheap as possible. This can be achieved by enforcing a strict separation between what depends on the actual target platform and what is common and independent of it.
- If we enforce such separation, not only in the application code but also in the DevSecOps part, we keep the required flexibility and improve the overall efficiency of the development cycle.
While designing and planning a new (micro)services based architecture, there are moments when architects have to think about a deployment strategy and therefore ask themselves: “Shall we deploy this (micro)service as a serverless function, or is it better to place it in a container? Or maybe on a dedicated virtual machine (VM)?”
As often happens these days, there is only one answer: “it depends”. And not only does it depend, it can also change: the optimal solution of today can become very suboptimal a few months down the line. At the launch of a new app it may be convenient to opt for Serverless Functions, since they are fast to set up and require very little upfront investment.
Moreover, their “pay per use” model is very attractive when we are not sure about the load the app will have to sustain. Later on, when the service matures and the load becomes predictable, it may be convenient to move to a more traditional deployment topology based on containers or dedicated servers.
When designing new systems for the cloud era, it is therefore important to preserve the freedom to change the deployment strategy of the various (micro)services at the least possible cost. Being on the right infrastructure at the right time can bring significant savings on cloud bills.
Serverless Functions
Serverless functions (also known as FaaS, functions as a service) are units of logic that get instantiated and executed in response to configured events, like HTTP requests or messages received in a Kafka topic. Once the execution completes, such functions disappear, at least logically, and their cost goes to zero.
FaaS responding to HTTP events: multiple parallel executions and “pay per use” model
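To make this concrete, here is a minimal sketch of a serverless function reacting to an HTTP event, written as an AWS Lambda handler in TypeScript (the greeting logic is a hypothetical placeholder):

```typescript
// Minimal AWS Lambda handler reacting to an HTTP (API Gateway) event.
// An instance is spun up on demand for the incoming request; once the
// execution completes, it logically disappears and its cost goes to zero.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const name = event.queryStringParameters?.name ?? "world";
  return {
    statusCode: 200,
    body: JSON.stringify({ message: `Hello, ${name}` }),
  };
};
```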
All major public clouds have FaaS offerings (AWS Lambda, Azure Functions, Google Cloud Functions, Oracle Cloud Functions), and FaaS can also be made available on premises with frameworks like Apache OpenWhisk. They have some limitations in terms of resources (for instance, on AWS, a maximum of 10 GB of memory and 15 minutes of execution time) but can support many use cases of modern applications.
The granularity of a serverless function can vary: from a single, very focused responsibility, for instance generating an SMS message when an order is confirmed, to a full microservice that can live side by side with other microservices implemented on containers or running on virtual machines (Sam Newman details the FaaS-microservices relationship in this interesting talk).
A front end application served by microservices implemented with different deployment models
Advantages of FaaS
When serverless functions are idle they cost nothing (the “pay per use” model). If a serverless function is called by 10 clients at the same time, 10 instances of it are spun up almost immediately (at least in most cases). The entire provisioning of infrastructure, its management, high availability (at least up to a certain level) and scaling (from 0 to the limits defined by the client) are provided out of the box by teams of specialists working behind the scenes.
Serverless functions provide elasticity on steroids and allow you to focus on what is differentiating for your business.
FaaS therefore brings two great advantages:
- Elasticity on steroids: scale up & down on demand and pay only for what is used.
- Focus on what differentiates your business: concentrate forces on the development of the most critical applications, without dispersing precious energy on complex fields such as modern infrastructure, which FaaS offers as a commodity.
Elasticity and costs
A “good service” means, among other things, a consistently good response time. Consistency is the key here: response time has to be good when the application is under normal load as well as at peak.
A “new service” needs to reach the market fast, with the lowest possible upfront investment, and needs to be a “good service” from the start.
When we want to launch a new service, a FaaS model is likely the best choice. Serverless functions can be set up fast and minimise infrastructure work. Their “pay per use” model means no upfront investment. Their scaling capabilities provide consistently good response times under different load conditions.
If, after some time, the load becomes more stable and predictable, then the story can change, and a more traditional model based on dedicated resources, whether Kubernetes clusters or VMs, can become more convenient than FaaS.
Different load profiles may cause a dramatic variance in cloud bills when comparing FaaS with solutions based on dedicated resources.
But how big can the difference in cost be? As always, it depends on the specific scenario, but what is certain is that the wrong choice can have a big impact on the cloud bill.
Serverless Functions can SAVE BIG money, but they can also COST BIG money
Estimating cloud costs is becoming a science in itself, but we can still get a feel for the potential benefits of one model over the other using the AWS price list at the time of writing.
The sample application. Let’s consider an application that receives 3,000,000 requests per month. Each request takes 500 ms to be processed by a Lambda with 4 GB of memory (CPU is assigned automatically based on the memory configured).
FaaS has a “pay per use” model, so regardless of the load curve (whether peaky or flat) the cost per month is fixed: 100.60 USD.
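This figure can be reproduced from the two components of Lambda pricing, the per-request charge and the GB-seconds of compute, using the rates in the AWS price list at the time of writing:

```typescript
// Lambda monthly cost = request charge + compute charge (GB-seconds).
const requests = 3_000_000;   // requests per month
const durationS = 0.5;        // 500 ms per request
const memoryGb = 4;           // memory allocated to the function

const requestCost = (requests / 1_000_000) * 0.2;    // 0.60 USD
const gbSeconds = requests * durationS * memoryGb;   // 6,000,000 GB-s
const computeCost = gbSeconds * 0.0000166667;        // ≈ 100.00 USD

console.log((requestCost + computeCost).toFixed(2)); // "100.60"
```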
On the other hand, if we consider a model based on dedicated VMs, things are different and costs depend heavily on the shape of the load curve.
A scenario with load peaks. If the load is characterised by peaks and we want to guarantee a consistently good response time for our clients, we need to size the infrastructure to sustain the peak. If at peak we have 10 concurrent requests per second (quite possible if the 3,000,000 requests are concentrated in certain hours of the day, or on certain days such as month end), we might well need a VM (AWS EC2) with 8 CPUs and 32 GB of memory to provide the same performance as Lambda. In this case the monthly cost would jump to 197.22 USD (savings can be obtained with multi-year commitments, but these reduce financial flexibility). Costs have basically doubled. The difference could be mitigated by dynamically switching EC2 instances on and off according to the load, but this requires the load to be foreseeable and increases the complexity and cost of the solution.
A scenario with flat load. If the load is basically flat, we are in different territory. If there are no peaks, we can easily sustain the load with a much smaller machine. A VM with 2 CPUs and 8 GB of memory would probably suffice, and the monthly cost in this case would be 31.73 USD, less than a third of the cost of Lambda.
A realistic business case is much more complex and needs thorough analysis, but even from these simplified scenarios it is clear that a FaaS model can be very attractive in some cases and become a real burden if constraints change. So it is important to have the flexibility to change the deployment model accordingly.
The question becomes then: How can we achieve such flexibility? How difficult is it going to be?
The anatomy of the codebase of a modern application
When using modern development practices, the codebase of an application usually ends up being split into logical areas:
- Application Logic. The code (usually written in languages like Java, TypeScript or Go) that implements what the application has to do
- DevSecOps (CI/CD). Typically scripts and configuration files that automate the build, test, security checks and deployment of the application
Application logic
We can further divide the application logic of a back-end service into logical parts:
- Business Logic. The code that implements the behaviour of the service, expressed in the form of logical APIs (methods or functions) that typically expect some data, maybe as JSON, as input and return some data as output. This code has no dependency on the technical mechanics of the actual environment where it runs, whether that is a container, a serverless function or an application server. When in a container, these logical APIs can be invoked by the likes of Spring Boot (if Java is the language), Express (with Node) or Gorilla (with Go). When invoked in a serverless function, they will be called through the specific FaaS mechanism implemented by the chosen cloud provider.
- Deployment related code. The code that deals with the mechanics of the run environment. If more than one deployment model has to be supported, there must be different implementations of this part (but only of this part). In the case of container-based deployment, this is where the dependencies on the likes of Spring Boot, Express or Gorilla are concentrated. In the case of FaaS, this part contains the code implementing the mechanics defined by the specific cloud provider (AWS Lambda, Azure Functions and Google Cloud Functions each have their own proprietary libraries to invoke the business logic). A minimal sketch of this separation follows the figure below.
Separation of deployment related code from common business logic to support switching among different deployment models
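Here is a minimal sketch of this separation, using hypothetical names and a TypeScript/Node stack: the business logic is a pure module, and two thin adapters expose it, one for AWS Lambda and one for an Express-based container.

```typescript
// business-logic/confirm-order.ts — pure business logic, no runtime imports.
export interface OrderRequest {
  orderId: string;
  items: string[];
}

export interface OrderConfirmation {
  confirmationId: string;
}

export async function confirmOrder(req: OrderRequest): Promise<OrderConfirmation> {
  // ...real business rules would live here...
  return { confirmationId: `conf-${req.orderId}` };
}

// lambda/handler.ts — deployment related code for AWS Lambda.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import { confirmOrder } from "../business-logic/confirm-order";

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const confirmation = await confirmOrder(JSON.parse(event.body ?? "{}"));
  return { statusCode: 200, body: JSON.stringify(confirmation) };
};

// container/server.ts — deployment related code for an Express container.
import express from "express";
import { confirmOrder } from "../business-logic/confirm-order";

const app = express();
app.use(express.json());
app.post("/orders/confirm", async (req, res) => {
  res.json(await confirmOrder(req.body));
});
app.listen(8080);
```

Note the direction of the dependencies: both adapters import confirmOrder, while the business logic module imports nothing runtime-specific.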
To grant flexibility in the deployment strategy at the least possible cost, it is crucial to keep these two parts clearly separated, which means:
- The “deployment related code” imports modules from the “business logic”
- The “business logic” never imports any module/package which depends on a specific runtime.
By following these two simple rules, we maximise the amount of (business logic) code shared among all deployment models, and therefore minimise the cost of moving from one model to another.
It is impossible to estimate in the abstract the relative sizes of the “business logic” and “deployment related code” parts. Analysing one simple interactive game deployable both on AWS Lambda and Google App Engine, it turns out that the “deployment related code” weighs about 6% of the codebase (about 7,200 lines of code in total). So 94% of the codebase is the same, regardless of whether the service runs in a Lambda or in a container.
DevSecOps (CI/CD)
This is the part of the codebase responsible for automating the build, test (including the security gateways) and deployment of the application.
The build and deploy phases are, by their nature, heavily dependent on the actual runtime environment chosen for the execution of the service. If we opt for containers, the build phase will likely use Docker tooling. If we choose a FaaS provider, there is no build phase as such, but there will be commands to upload the code to the FaaS engine and instruct it on the entry point to call when the serverless function is invoked. Similarly, there may be security checks that are specific to a FaaS model.
At the same time, if we enforce a neat separation between “business logic” and “deployment related code”, we can reap great benefits when it comes to testing. If the two are separate, unit and, most importantly, integration tests can be run independently of the final deployment model, choosing the simplest configuration and running the test suites against it. Tests can even be run on developers’ workstations, greatly increasing the speed of test feedback.
There is still a need for some test steps to happen after the actual deployment is completed, but the bulk of the test work, namely testing the correctness of the “business logic”, can be performed independently of the actual deployment, significantly increasing developer productivity.
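For instance, the hypothetical confirmOrder logic sketched earlier can be unit-tested with plain Jest on a developer workstation, with no Lambda or Express involved:

```typescript
// confirm-order.test.ts — exercises the business logic in isolation,
// independent of any deployment model.
import { confirmOrder } from "./business-logic/confirm-order";

test("confirmOrder returns a confirmation id for a valid order", async () => {
  const confirmation = await confirmOrder({ orderId: "42", items: ["book"] });
  expect(confirmation.confirmationId).toBe("conf-42");
});
```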
Strive to isolate what is not dependent on the deployment model and maximise it
If we want to be flexible in the choice of deployment model, we have to keep clear boundaries between what pertains to “deployment related code” and what is independent of it. This means designing our solutions up front to minimise the former and maximise the latter, and enforcing a strong division between these parts during the lifetime of the project/product, for instance with regular design and code reviews.
Split of concerns among parts of a modern application codebase
Conclusions
Serverless Functions represent a great option in modern architectures. They offer the fastest time to market, the lowest entry cost, and the highest elasticity. For these reasons they are the best choice when we need to ship a new product fast or have to face highly variable loads.
But things can change over time. Loads can become more stable and predictable, and in such circumstances, FaaS models can turn out to be much more expensive than more traditional models based on dedicated resources.
It is therefore important to keep the flexibility to change deployment models of applications at the lowest possible cost. This can be achieved by enforcing a clear separation between what is the core of the application logic, which is independent of the platform it runs on, and what depends on the specific deployment model chosen.
Following a few simple architectural guidelines from the start, and incorporating regular peer review processes, is how we can keep this flexibility.
About the Author
Enrico Piccinin is a man with a passion for code and for some of the strange things that sometimes happen in IT organizations. With many years of experience in IT development, he likes to see what happens when the “new IT” is applied to traditional organizations. Enrico can be found at enricopiccinin.com or on LinkedIn. Views and thoughts here are his own.