Key Takeaways
- When selecting a cloud option, it’s important to understand that the level of abstraction that each one provides has a direct impact on the administration cost.
- Consider the company’s ability to manage the infrastructure, including day-to-day management, and how resilient the product is to future changes. Continuous customization may require deploying the product multiple times.
- Decide how much control you want over the infrastructure. High-end dedicated instances will provide maximum control versus Serverless, low-code, and no-code platforms, which offer the least amount of control.
- Determine the amount of customization, major changes, vertical shift, horizontal shift, and new business needs that may arise, then select data and application services accordingly.
- Avoid fixed costs for as long as possible. Pick the cheapest pay-as-you-go subscription and move to better options later.
Gartner’s prediction of cloud adoption investment reaching $482B in 2022 is a weighty indicator of how the cloud is penetrating diverse sectors. What is alarming, though, is the cloud migration failure rate. At present, it hovers between 44% and 57% for businesses of all sizes, which puts start-ups, with their obvious budget constraints, under a lot of pressure.
Software as a Service (SaaS) start-ups are no exception. As a solution architect, I have been designing SaaS applications for years and I have seen start-ups struggle to find the right cloud infrastructure and improve their product offering.
These experiences prompted me to write this article as a tool to help companies make a pragmatic, fact- and data-driven decision.
Before we jump into the cloud options, it is important to understand the level of abstraction that each one provides, as it directly impacts the administration cost. A higher level of abstraction means less control, lower performance, and higher cost, but also less effort and better utilization.
If you are building a SaaS product without any of these abstractions, you first need to purchase and provision the hardware. Then you install the operating system on top of it, followed by runtimes such as the JVM, V8, or Python. After that, you install all dependencies and, finally, deploy your code.
Cloud Infrastructure Options
Every infrastructure option available today abstracts one or more of these layers. The main options are:
Cloud Virtual Machines (IaaS): They primarily abstract the hardware layer. You don’t need to provision anything physical, but you still have to build the other layers. This gives you maximum control, but it takes time to set up. Examples are Amazon EC2, Azure Virtual Machines, and Google Compute Engine (GCP).
Platform as a Service (PaaS): It adds another layer of abstraction on top of the hardware, so you don’t need to worry about the OS/containers, upgrades, security, etc. Examples are Azure PaaS, AWS Elastic Beanstalk, and GCP PaaS.
Serverless (Function as a Service, FaaS): This is PaaS with the runtime abstracted as well, so you don’t need to worry about the runtime. Major examples are AWS Lambda, Azure Functions, and Google Cloud Functions.
Low Code: Along with the hardware, OS, and runtime, dependency management is also abstracted. One example is Parse. You still need to put serious thought into best practices.
Kubernetes (K8s) (Container Orchestration): If you invest in Kubernetes up front, or use a Kubernetes-as-a-service offering such as EKS, then once it is production-ready you ship your code as pods. From an abstraction perspective, it is similar to Serverless but still provides more control.
Zero Code: There are platforms and services that allow you to create applications without writing any code. However, that doesn’t mean you don’t need developers. These platforms deliver fast prototypes, MVPs, and initial bootstrap code. Examples are Zoho and Quick Base. We are not going to cover zero-code platforms.
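The layering described above can be written down as a small lookup. The following is a minimal sketch, assuming the five layers named earlier (hardware, OS, runtime, dependencies, code); the names `LAYERS`, `ABSTRACTED`, and `layers_you_manage` are made up for illustration. Kubernetes and zero code are left out because the descriptions do not pin them to specific layers.

```python
# Stack layers, bottom to top, as described in the text.
LAYERS = ["hardware", "os", "runtime", "dependencies", "code"]

# Layers each option hides from you, taken from the descriptions above.
# This mapping is an illustrative simplification, not a provider spec.
ABSTRACTED = {
    "Cloud Virtual Machines (IaaS)": {"hardware"},
    "PaaS": {"hardware", "os"},
    "Serverless (FaaS)": {"hardware", "os", "runtime"},
    "Low Code": {"hardware", "os", "runtime", "dependencies"},
}


def layers_you_manage(option: str) -> list[str]:
    """Return the layers your team still owns, bottom to top."""
    hidden = ABSTRACTED[option]
    return [layer for layer in LAYERS if layer not in hidden]


for option in ABSTRACTED:
    print(f"{option}: {', '.join(layers_you_manage(option))}")
```

The fewer layers left in the list, the lower the administration effort and the lower the control.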
Now let’s drill down to discuss key factors that can impact the outcome.
7 Factors Impacting the Infrastructure of a SaaS Application
Factor 1: Administration Overhead
The first consideration is the company’s ability to manage the infrastructure, including the time required, whether humans are needed for the day-to-day management, and how resilient the product is to future changes.
If the product is used primarily by enterprises and demands customization, then you may need to deploy the product multiple times, which could mean more effort and time from the infra admins. The deployment can be automated, but the automation process requires the product to be stable, so the ROI might not be good for an early-stage product. My recommendation in such cases would be to use managed services such as PaaS for infrastructure, managed services for the database/persistence layer, and FaaS (Serverless architecture) for compute.
Factor 2: Time to Market (TTM)
The keys to quick TTM are fast development, testing, and release. And the key to getting quickly from development to release is to spend more time on coding and testing than on provisioning and deployments. Low-code and no-code platforms are good to start with. Serverless and FaaS are designed to solve these problems. If your system involves many components, building your own boxes will consume too much time and effort. Similarly, setting up Kubernetes will not make it faster.
PaaS still provides better options than cloud virtual machines, but you may need to build deployment pipelines (CI/CD) to speed up TTM. CI/CD pipelines are available implicitly in low-code platforms. You may also want to pick tools that are cloud agnostic and allow you to migrate to other platforms later. There is significant risk with zero-code and low-code platforms in this regard.
Factor 3: Agility
Product agility is a key factor. You need to consider the amount of customization, major changes, vertical shift, horizontal shift, and new business needs that may arise. Imagine that you are building a multi-tenant system and there are different customization requirements for different tenants. These changes/requests will keep on coming to you. From an infrastructure perspective, you need a system where you don’t have to change the infrastructure for each request/change. Being cloud agnostic is irrelevant here.
For data, Serverless data services like Amazon Aurora or Azure Cosmos DB are great choices. If you are building a workflow or a data-processing pipeline, then online services like AWS Step Functions are the way to go. For applications, Serverless or FaaS is a great choice. You can also build the multi-tenant system with Kubernetes, but it is not a good starting point, as you may need to maintain many versions of the application, data, and functions. Serverless architecture might be the right starting option.
Factor 4: Control
It’s important to think about how much control you will get over the infrastructure. You would want more control if:
- a) There will be lots of apps, lots of databases, and lots of services.
- b) It is a system where you have to provision hardware for your customers (as MongoDB Atlas does).
- c) You need to isolate data or runtime or both for your tenants.
- d) It is an online service or API and your USP is to save license, hardware, and administrative cost for your customers.
You would get maximum control with a physical machine or your own rack, but these are rarely used anymore, so the next best way to maintain a high level of control is high-end dedicated instances. Serverless, low-code, and no-code platforms offer the least amount of control.
Kubernetes will consume lots of time and effort, but from a long-term control perspective it is a good deal, and you have to be 100% cloud agnostic here. Avoid depending on online services as much as possible; remember that you are building one yourself.
Factor 5: Cost
Cost is one of the most important factors. Early cost estimates are always difficult, but let’s start with an example:
For a steady 10K requests per hour, around the clock, a Serverless infrastructure will cost you much more than cloud virtual machines. But if the load is uneven, say 10K in some random hours and 1K in the others, then cloud virtual-machine instances might be the costly option, since they will be underutilized most of the time and deliver no value while idle.
To start with, try to avoid fixed costs for as long as possible. For better utilization, though, you need to figure out the break-even points and switch to a lower level of abstraction once you pass them (low-code to Serverless, or Serverless to a containerized app). Avoid premature optimization: in the beginning, don’t pursue optimization or balance at all. Pick the cheapest pay-as-you-go subscription and move to better options later.
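A rough way to find such a break-even point is to compare a fixed monthly VM bill with a per-request Serverless bill. The sketch below is only an illustration: the prices, the 730-hour month, and the 10%-peak traffic shape are all assumptions, not real provider quotes, and it ignores memory, duration, storage, and egress charges.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month

# Hypothetical prices for illustration only, not real provider quotes.
VM_HOURLY_RATE = 0.10               # one always-on VM sized for the 10K/hour peak
PRICE_PER_MILLION_REQUESTS = 20.0   # all-in Serverless cost per million requests


def vm_monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Fixed cost: instances are billed whether or not they are busy."""
    return hourly_rate * HOURS_PER_MONTH * instances


def serverless_monthly_cost(requests: int, price_per_million: float) -> float:
    """Pay-as-you-go cost: scales linearly with the number of requests."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(hourly_rate: float, price_per_million: float,
                        instances: int = 1) -> float:
    """Monthly request count at which both options cost the same."""
    return vm_monthly_cost(hourly_rate, instances) / price_per_million * 1_000_000


# Steady load: 10K requests every hour of the month.
steady = 10_000 * HOURS_PER_MONTH
# Uneven load: 10K requests in 10% of the hours, 1K in the rest.
peak_hours = 73
bursty = 10_000 * peak_hours + 1_000 * (HOURS_PER_MONTH - peak_hours)

for name, requests in [("steady", steady), ("bursty", bursty)]:
    serverless = serverless_monthly_cost(requests, PRICE_PER_MILLION_REQUESTS)
    vm = vm_monthly_cost(VM_HOURLY_RATE)
    print(f"{name}: serverless={serverless:.2f} vm={vm:.2f}")

print("break-even requests/month:",
      round(break_even_requests(VM_HOURLY_RATE, PRICE_PER_MILLION_REQUESTS)))
```

With these made-up prices, the VM is cheaper under the steady load and Serverless is cheaper under the uneven load. The break-even point moves as soon as the prices change, so rerun the comparison with your own numbers.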
Factor 6: Migration
Migration is directly related to being cloud agnostic. Newer, cheaper, and better cloud offerings keep appearing, so at some point you will need to migrate. Sometimes migration depends on which cloud providers your customers want to work with. Just using virtual machines doesn’t make your system cloud agnostic.
For example, if your components need to access one another and your DevOps team has designed this access management entirely around AWS IAM roles, then migrating from AWS to GCP could be a tough nut to crack. Similarly, if you have built your entire computation layer on Serverless, then migrating to virtual machines might not be straightforward.
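One possible way to limit this kind of coupling is to have application code depend on a small provider-neutral interface and keep the provider-specific part in a single adapter. The following is a minimal sketch under that assumption; `AccessPolicy`, `StaticAccessPolicy`, and `read_report` are hypothetical names, and the in-memory rule table merely stands in for whatever a real adapter would look up in the provider's access system.

```python
from typing import Protocol


class AccessPolicy(Protocol):
    """Hypothetical provider-neutral interface the application depends on."""

    def is_allowed(self, caller: str, target: str, action: str) -> bool: ...


class StaticAccessPolicy:
    """Stand-in adapter backed by an in-memory rule table.

    A real adapter would translate these checks to the cloud provider's
    own access-management service; only this class would change on migration.
    """

    def __init__(self, rules: set[tuple[str, str, str]]) -> None:
        self._rules = rules

    def is_allowed(self, caller: str, target: str, action: str) -> bool:
        return (caller, target, action) in self._rules


def read_report(policy: AccessPolicy, caller: str) -> str:
    """Application code: knows the interface, not the provider."""
    if not policy.is_allowed(caller, "report-store", "read"):
        raise PermissionError(f"{caller} may not read report-store")
    return "report contents"


policy = StaticAccessPolicy({("billing-service", "report-store", "read")})
print(read_report(policy, "billing-service"))
```

The trade-off is extra indirection today in exchange for a smaller surface to rewrite when the provider changes.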
Factor 7: Integration
If you are building an aggregator platform, then you might be collecting data from third-party APIs and scrapers, or executing transactions against other APIs on behalf of your customers. This is the integration space and, as a start-up, your primary concern is how fast, reliable, and consistent your infra is.
With integration, you might be spawning multiple spot instances or multiple Serverless instances in a short time to collect data from, or submit data to, other APIs while working around throttling and API rate limits. Serverless is a great help here. Auto-scaled Kubernetes nodes are also good. If you are choosing a cloud virtual-machine instance, then you must spend some time and effort on automating provisioning.
With the infrastructure options available and the factors defined above, I am proposing a decision matrix that can help you make decisions for your infrastructure.
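Whichever compute option does the fan-out, the workers still have to cope with throttling. The sketch below shows an in-process analogue of that pattern: a bounded-concurrency fan-out with retries and exponential backoff. `fetch_page` is a stand-in for a real third-party API call and `RateLimited` is a hypothetical exception; neither comes from any particular client library.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical error raised when the remote API throttles a call."""


async def fetch_page(page: int) -> dict:
    """Stand-in for a third-party API call; replace with a real client."""
    await asyncio.sleep(0.01)
    return {"page": page}


async def fetch_with_retry(page: int, limit: asyncio.Semaphore,
                           retries: int = 3, base_delay: float = 0.05) -> dict:
    """Call the API under a concurrency cap, backing off when throttled."""
    for attempt in range(retries + 1):
        async with limit:  # at most max_concurrency calls in flight
            try:
                return await fetch_page(page)
            except RateLimited:
                if attempt == retries:
                    raise  # give up after the last allowed attempt
        # Exponential backoff with jitter, outside the semaphore so the
        # slot is free for other calls while this one waits.
        await asyncio.sleep(base_delay * 2 ** attempt
                            + random.uniform(0, base_delay))
    raise RuntimeError("unreachable")


async def collect(pages, max_concurrency: int = 5) -> list:
    """Fetch all pages concurrently; results keep the input order."""
    limit = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(fetch_with_retry(p, limit) for p in pages))


results = asyncio.run(collect(range(10)))
print(len(results), "pages collected")
```

With Serverless, the concurrency cap would typically be a platform setting rather than a semaphore, but the retry-and-backoff logic stays in your code either way.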
Decision Framework
I’ve created a framework with the options, factors, and level of difficulty. The ratings that I have used here are purely subjective, as they are based on my experience of working with different infrastructures for over a decade and not on benchmarks.
The table that I have created will allow you to understand how difficult it is to achieve (build and set up) a factor using a particular type of infra choice.
- Easy: You can do a simple configuration and you are done. With less effort and time, you can achieve the required factor.
- Medium: You might need to do more configuration/tuning to achieve a particular factor, and the path might not be straightforward.
- Hard: To achieve the factor, you may need to invest time and effort using an explicit strategy. You may require certain expertise as well.
Factor | Cloud Virtual Machines | PaaS (GCP, Azure) | Serverless (FaaS) | Kubernetes | Low code |
Administration | Hard | Easy | Easy | Medium | Easy |
Fast Time to Market | Hard | Medium | Easy | Medium | Easy |
Agility | Hard | Hard | Easy | Easy | Easy |
Control | Easy | Hard | Hard | Easy | Hard |
Cost | Easy | Medium | Hard | Easy | Hard |
Migration | Easy | Hard | Hard | Medium | Hard |
Integration | Hard | Hard | Easy | Easy | Easy |
Utilization | Hard | Hard | Easy | Easy | Easy |
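To apply the table to your own priorities, one option is to turn the ratings into numbers and weight the factors. The sketch below copies the ratings from the table; the 1/2/3 scale, the linear weighted sum, and the function names `score` and `rank` are assumptions made for illustration, and the result is as subjective as the ratings themselves.

```python
# Numeric scale for the ratings (an assumption; lower means easier).
DIFFICULTY = {"Easy": 1, "Medium": 2, "Hard": 3}

OPTIONS = ["Cloud Virtual Machines", "PaaS", "Serverless (FaaS)",
           "Kubernetes", "Low code"]

# Ratings copied from the table, in the same column order as OPTIONS.
MATRIX = {
    "Administration":      ["Hard", "Easy",   "Easy", "Medium", "Easy"],
    "Fast Time to Market": ["Hard", "Medium", "Easy", "Medium", "Easy"],
    "Agility":             ["Hard", "Hard",   "Easy", "Easy",   "Easy"],
    "Control":             ["Easy", "Hard",   "Hard", "Easy",   "Hard"],
    "Cost":                ["Easy", "Medium", "Hard", "Easy",   "Hard"],
    "Migration":           ["Easy", "Hard",   "Hard", "Medium", "Hard"],
    "Integration":         ["Hard", "Hard",   "Easy", "Easy",   "Easy"],
    "Utilization":         ["Hard", "Hard",   "Easy", "Easy",   "Easy"],
}


def score(weights=None):
    """Weighted difficulty per option; unlisted factors get weight 1."""
    weights = weights or {}
    totals = dict.fromkeys(OPTIONS, 0.0)
    for factor, ratings in MATRIX.items():
        weight = weights.get(factor, 1.0)
        for option, rating in zip(OPTIONS, ratings):
            totals[option] += weight * DIFFICULTY[rating]
    return totals


def rank(weights=None):
    """Options ordered from easiest to hardest overall."""
    totals = score(weights)
    return sorted(totals, key=totals.get)


print(rank())  # equal weights
print(rank({"Fast Time to Market": 3, "Administration": 3}))
```

With equal weights, Kubernetes has the lowest total difficulty. Tripling the weight of time to market and administration moves Serverless and low code slightly ahead of it, which shows how sensitive the outcome is to what you prioritize.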
The Verdict
For SaaS start-ups, I have realized that it is better to start with Kubernetes (container orchestration), and if Kubernetes is not an option, then cloud virtual machines should be the next infrastructure option. Kubernetes provides maximum control with minimum effort and ensures cost optimization along with future migration and integration.
You need to keep away from low-code/no-code platforms. They might seem easy to start with, but they are minefields for the future and will not help you with three crucial factors: infrastructure cost, IT administration cost, and licensing cost. PaaS is somewhat acceptable, but it will also put blockers in your way when it comes to operating-system-level upgrades.