Key Takeaways
- Cloud providers shifted focus from infrastructure services to application-first services consumed directly by developers, giving rise to a new application architecture.
- This architecture lets developers offload integration logic and management responsibilities into cloud services and focus on implementing the business logic.
- “Cloud bound” represents the evolution of cloud native from addressing compute-centric concerns to managing application integration concerns.
- Cloud-bound applications decouple applications’ internal architecture from their external dependencies using open APIs and de facto standards.
- Cloud-bound applications use open APIs and data formats to bind with compute infrastructure and to offload integration responsibilities, such as stateful orchestration, event-driven interactions, and reliable synchronous interactions, into cloud services.
The increasing adoption of application-first cloud services is causing applications to blend with cloud services at levels much deeper than before. The runtime boundaries between the application and the cloud are shifting from virtual machines to containers and functions. The integration boundaries are shifting from mere database and message broker access to an architecture where the mechanical parts of the application are blended with, and run within, the cloud. In this resulting architecture, applications are “cloud bound” and allow developers to focus on the business logic by offloading more application logic and management responsibilities into cloud services.
This article examines the commoditization of the full software stack by binding the application to cloud services using open APIs and standards that preserve flexibility and portability.
Internal architecture evolution
The internal architecture of an application is typically owned and controlled by a single team. Depending on the language and runtime of choice, tools and abstractions such as packages, modules, interfaces, classes, and functions help developers control the inner boundaries. Domain-Driven Design (DDD) assists developers in crafting domain models, which serve as abstractions encapsulating complex business logic and mediating the divide between business reality and code.
Hexagonal, Onion, and Clean architectures can complement DDD and arrange application code with distinct boundaries and externalized infrastructure dependencies. While these approaches were innovative at the time of their inception and remain relevant today, they were initially developed for three-tier Java applications comprising JSPs, Servlets, and EJBs deployed in shared application runtimes. The primary focus then was decoupling application logic from the UI and database and enabling isolated testing.
Figure 1: Internal application architecture
Since then, new challenges and concepts, such as microservices and the twelve-factor app, have emerged and influenced how we design applications. Microservices center on separating application logic into independently deployable units, each owned by a single team. The twelve-factor app methodology aims to create distributed, stateless applications that run and scale in dynamic cloud environments. All of these architectures introduced principles and best practices that shape how we structure an application’s internal architecture and how we manage it.
Figure 2: Application architecture evolution timeline
Later in the application architecture evolution timeline, the mainstream adoption of containers and the introduction of Kubernetes revolutionized the way applications are packaged and orchestrated. AWS Lambda introduced the concept of highly scalable functions as a service (FaaS), taking the idea of application granularity to the next level and offloading the complete infrastructure management responsibilities to the cloud provider. Other technology trends, such as service mesh and Mecha architecture, have also emerged and commoditized non-functional aspects of the application stack, such as networking and distributed developer primitives, respectively, by extracting them into sidecars. Inspired by microservices, Data Mesh architecture aimed to break down the analytical data architecture of applications into smaller, independent data domains, each with its own product and team. These, and more recent trends such as application-first cloud services, started reshaping applications’ external architecture, resulting in what I collectively refer to as “cloud-bound applications” in this article.
External architecture evolution
The external architecture is where an application intersects with other applications and with infrastructure that other teams and organizations typically own, in the form of specialized on-premises middleware, storage systems, or cloud services. The way the application connects to external systems and offloads some of its responsibilities forms the external architecture. To benefit from the infrastructure, an application needs to bind with that infrastructure while enforcing clean boundaries to preserve its agility. An application’s internal architecture and implementation should be able to change without affecting the external architecture, and it should be possible to swap external dependencies, such as cloud services, without changing the internals.
Figure 3: External application architecture
Broadly, we can group the way an application binds with its surroundings into two categories.
- Compute bindings are all the necessary bindings, configurations, APIs, and conventions used to run an application on a compute platform such as Kubernetes, a container service, or even serverless functions (such as AWS Lambda). Mostly, these bindings are transparent to the internal architecture and are configured and used by operations teams rather than developers. The container abstraction is the most widespread “API” for application compute binding today.
- Integration bindings is a catch-all term for all other bindings to external dependencies that an application relies upon. Cloud services also use these bindings to interact with the application, usually over well-defined HTTP “APIs” or specialized messaging and storage access protocols, such as the AWS S3, Apache Kafka, and Redis APIs. The integration bindings are not as transparent as the compute bindings: developers need to implement additional logic around them, such as retries, TTLs, delays, and dead-letter queues (DLQs), and bind these to the application’s business logic.
Applications run on the cloud and consume other services by using these bindings. Let’s examine in more detail what exactly is behind each of these bindings and what is not.
Compute bindings
For the operations teams, ideally, each application is a black box that needs to be operated on the compute platform. The compute bindings are used to manage an application’s lifecycle on platforms such as Kubernetes, AWS Lambda, and other services. These bindings are formalized and defined as a collection of configurations and API interactions between the application and the platform running it. Most of these interactions are transparent to the application, and there is only a handful of APIs that developers need to implement, such as the health endpoints and metrics APIs. This is how far the current CNCF definition and scope of “cloud native” extend, and as long as developers implement cloud-native applications, they can bind to and run on a cloud compute platform.
Figure 4: Application and platform compute bindings
To run on a cloud platform reliably, the application has to bind with it on multiple levels, ranging from specifications to best practices. This happens through a collection of industry-standard specifications, such as container APIs, metrics APIs (for example, based on Prometheus), and health endpoints, or through cloud vendor specifications, such as those of AWS Lambda or AWS ECS. It also happens through cloud-native best practices and shared knowledge, such as health checks, deployment strategies, and placement policies. Let’s look at the common compute bindings used today.
Resource demands
Applications, including microservices and functions, require resources such as CPU, memory, and storage. These resources are defined differently depending on the platform being used. For example, on Kubernetes, CPU and memory are defined through requests and limits, while on AWS Lambda, the user specifies the amount of memory to allocate at runtime, with a corresponding allocation of CPU. Storage is also handled differently on these platforms, with Kubernetes using ephemeral storage and volumes, and Lambda offering ephemeral scratch storage and durable storage based on Amazon EFS mounts.
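As an illustration, here is a minimal sketch of declaring such resource demands with the official Kubernetes Python client; the container name, image, and values are hypothetical.

```python
# A minimal sketch of declaring resource demands via the Kubernetes Python
# client; on AWS Lambda, the rough equivalent is a single memory setting
# with CPU allocated proportionally. Names and values are illustrative.
from kubernetes import client

container = client.V1Container(
    name="order-service",  # hypothetical service name
    image="example.com/order-service:1.0",
    resources=client.V1ResourceRequirements(
        # Requests are what the scheduler reserves; limits are the hard cap.
        requests={"cpu": "250m", "memory": "256Mi", "ephemeral-storage": "1Gi"},
        limits={"cpu": "500m", "memory": "512Mi"},
    ),
)
```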
Lifecycle hooks
Applications managed by a platform often need to be aware of important lifecycle events. For example, on Kubernetes, concepts such as init containers and hooks like PostStart and PreStop allow the application to react to these events. Similarly, Lambda’s Extensions API allows the application to intercept the Init, Invoke, and Shutdown phases. Other options for handling lifecycle events include wrapper scripts or language-specific runtime mechanisms, such as a shutdown hook for the JVM. These mechanisms form a contract between the platform and the application, enabling the application to respond to and manage its own lifecycle.
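For instance, here is a minimal Python sketch of the application-side half of this contract, assuming a containerized app that must shut down gracefully when Kubernetes sends SIGTERM before terminating the container:

```python
# A minimal sketch of reacting to the platform's termination signal, the
# Python analog of a JVM shutdown hook. Kubernetes sends SIGTERM first and
# cleanup must finish within the configured grace period (30s by default).
import signal
import sys

def on_shutdown(signum, frame):
    # Drain in-flight work, close connections, flush buffers, etc.
    print("SIGTERM received, shutting down gracefully")
    sys.exit(0)

signal.signal(signal.SIGTERM, on_shutdown)
```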
Health checks
Health probes are a way for the platform to monitor the health of an application and take corrective action if necessary, such as restarting the application. While Lambda functions do not have health probes due to the short lifespan of a request, containerized applications on platforms like Kubernetes, AWS EKS, and GCP Cloud Run do include health probes in their definitions. This allows the platform to ensure that the application runs smoothly and to take action when it does not.
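A health endpoint is often just a trivial HTTP handler. Here is a minimal sketch using only the Python standard library; the paths and port are illustrative and would have to match the platform’s probe configuration (for example, a Kubernetes livenessProbe):

```python
# A minimal sketch of liveness/readiness endpoints a platform probes.
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in ("/healthz", "/readyz"):
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```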
Deployment and placement policies
With knowledge of the required resources, the compute platform can begin managing the application’s lifecycle. To do so in a manner that does not compromise the integrity of the business logic, the platform must be aware of the scaling constraints. Some applications are intended to be singletons; for example, they need to maintain the order of processed events and cannot be scaled beyond one instance. Other stateful applications may be quorum-driven and require a minimum number of instances to be constantly running in order to function properly. And still others, such as stateless functions, may favor rapid scaling to address spikes in load. Once the scaling guidelines for the application have been established, the platform assumes control of initiating and terminating instances of the application.
Compute platforms also offer a variety of deployment strategies, including rolling, blue-green, canary, and all-at-once, to control the sequence of updates for a service. In addition to the deployment sequence, these platforms may allow the user to specify placement preferences. For example, Kubernetes offers options such as labels, taints and tolerations, affinity, and anti-affinity, while Lambda allows users to choose between regional and edge placement types. These preferences ensure the application is deployed in alignment with compliance and performance requirements.
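As a sketch of what such preferences look like in practice, the following uses the Kubernetes Python client to express a rolling update strategy and a zone placement constraint; the zone label value is illustrative:

```python
# A sketch of deployment and placement preferences with the Kubernetes
# Python client; these objects would be embedded in a Deployment spec.
from kubernetes import client

# Rolling update: add one new instance at a time, never drop below capacity.
strategy = client.V1DeploymentStrategy(
    type="RollingUpdate",
    rolling_update=client.V1RollingUpdateDeployment(max_surge=1, max_unavailable=0),
)

# Placement: require scheduling onto nodes in a specific zone.
affinity = client.V1Affinity(
    node_affinity=client.V1NodeAffinity(
        required_during_scheduling_ignored_during_execution=client.V1NodeSelector(
            node_selector_terms=[
                client.V1NodeSelectorTerm(
                    match_expressions=[
                        client.V1NodeSelectorRequirement(
                            key="topology.kubernetes.io/zone",
                            operator="In",
                            values=["eu-west-1a"],  # illustrative zone
                        )
                    ]
                )
            ]
        )
    )
)
```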
Network traffic
Directing low-level network traffic to service instances is also a responsibility of the compute platform, since the platform controls deployment sequencing, placement, and autoscaling, all of which impact how traffic reaches the service instances. Health checks can also play a role in traffic management, such as the readiness check in GCP Cloud Run and Kubernetes. By handling these tasks, the compute platform helps ensure that traffic is efficiently and effectively routed to the appropriate service instances.
Monitoring and reporting
Any compute platform for distributed applications must provide deep application insights in the form of logs, metrics, and tracing, and today there are a few widely accepted de facto standards in this space. Logs are ideally emitted in a structured format such as JSON or another industry-specific standard. The compute platform typically collects the logs or provides extension points for specialized log draining and analysis services to access them, whether that is a DaemonSet on Kubernetes, a Lambda partner extension for monitoring, or a Vercel edge function log drain.

The compute platform must also support the collection and analysis of metrics and tracing data in order to provide comprehensive insights into the performance and behavior of a distributed application. There are several industry-standard formats and tools for handling this data, such as Prometheus for metrics and OpenTelemetry (OTEL) for tracing. The compute platform may offer built-in tools for collecting and analyzing this data or provide extension points for specialized services to access it. Regardless of the granularity of the code (microservice or function) or its location (edge or not), the compute platform should allow the capture and export of logs, metrics, and tracing data to other best-of-breed cloud services, such as Honeycomb, Datadog, and Grafana, to name a few.
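To make this concrete, here is a minimal Python sketch that exposes a Prometheus metrics endpoint and emits structured JSON logs; the metric and event names are illustrative:

```python
# A minimal sketch of the monitoring bindings discussed above: a Prometheus
# metrics endpoint plus structured JSON log lines a log drain can parse.
import json
import logging
import time

from prometheus_client import Counter, start_http_server

ORDERS_PROCESSED = Counter("orders_processed_total", "Total processed orders")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def process_order(order_id: str) -> None:
    ORDERS_PROCESSED.inc()
    # Structured log line that a platform log drain can parse and route.
    logging.info(json.dumps({"event": "order_processed", "order_id": order_id}))

if __name__ == "__main__":
    start_http_server(9090)  # exposes /metrics for Prometheus to scrape
    process_order("o-123")
    time.sleep(60)  # keep the process alive long enough to be scraped
```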
Compute binding trends
Compute bindings are language and application runtime agnostic, and they are primarily used by operations teams for managing applications at runtime rather than by the developers implementing them.
While the size and complexity of applications can vary from monoliths to functions, they are typically packaged in containers, with health check endpoints and lifecycle hooks implemented and metrics exposed. Understanding these compute bindings will help you effectively use any container-based compute platform, whether it is an on-premises Kubernetes cluster; a managed container service such as AWS ECS, Google Cloud Run, or Azure Container Apps; a function-based runtime such as AWS Lambda or GCP Functions; or an edge runtime such as Vercel edge functions, Cloudflare Workers, or Netlify edge functions. Using open and de facto standard APIs will not only help you create portable applications but also limit vendor lock-in through operational practices and tools that are portable across cloud vendors and service providers.
Integration bindings
Integration bindings, on the other hand, are meant to be used by developers, not operations teams. They center around common distributed system implementation areas such as service invocation, event-driven interactions, task scheduling, and stateful workflow orchestration. They help connect the application with specialized storage systems and external systems through cloud-based middleware-like services, which I collectively refer to in this article as the integration cloud. In the same way containers provide compute abstractions, integration cloud services provide language-agnostic integration abstractions as a service. These primitives are independent of the use case, application implementation, runtime, and compute environment. For example, the Retry, dead-letter queue (DLQ), Saga, service discovery, and circuit breaker patterns can all be consumed as a service from the integration cloud.
Figure 5: Application and platform integration bindings
Today, a pure integration cloud with all the main patterns exposed as standalone features does not exist yet. Early cloud services offer some of these integration primitives as features of storage systems such as Kafka, Redis, and the like, but these features can rarely be used on their own or combined with others. Notable exceptions are services such as AWS EventBridge and Azure Event Grid, which you can use with multiple cloud services from the same vendor, but not directly with other vendors’ services. This is a fast-evolving space with some good examples and some gaps that are not yet filled, but I believe they will be in the future. To benefit, the application must bind with the integration cloud services and offload some of these developer responsibilities to them. Here are the main types of integration cloud services and binding aspects.
Integration demands
In the same way an application can demand resources and express deployment and placement preferences to the compute platform, the application can also demand and activate specific integration bindings. These bindings can be activated through configurations passed to the platform declaratively or activated at runtime through programmatic interactions. For example, applications can subscribe to pub/sub topics using declarative and programmatic subscriptions. An AWS Lambda function can subscribe to an event source declaratively through configurations or programmatically through a client library or SDK by asking the integration platform for a specific binding to be registered or unregistered. Applications can subscribe to cron job triggers, activate a connector to an external system, make configuration changes, and so on, all running on the integration cloud.
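As one concrete illustration of a programmatic integration demand, here is a hedged sketch using the Dapr Python SDK (dapr-ext-grpc), where the application asks its sidecar to register a pub/sub subscription on its behalf; the component and topic names are illustrative:

```python
# A hedged sketch of a programmatic subscription with the Dapr Python SDK:
# the app declares the binding it demands, and the Dapr sidecar registers
# the subscription on its behalf. Component and topic names are illustrative.
from dapr.ext.grpc import App
from cloudevents.sdk.event import v1

app = App()

@app.subscribe(pubsub_name="pubsub", topic="orders")
def on_order(event: v1.Event) -> None:
    print(f"Received event: {event.Data()}")

if __name__ == "__main__":
    app.run(50051)  # gRPC port the sidecar calls back on
```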
Workflow orchestration
Persistent service orchestration logic is a very common necessity and a prime candidate for externalizing and consuming as a service. As a result, workflow orchestration is among the best-known integration binding types today. Common uses of this service include implementations of the Saga pattern for service and business process orchestration, function orchestration with AWS Step Functions, Google Stateful Functions, and Azure Durable Functions, task distribution with Google Workflows, and many other services. When such a binding is used, part of the application’s orchestration state and logic is offloaded into another service. While the application services keep internal state and the logic to manage it, other parts live on the outside, potentially in some other cloud service. This represents a shift away from how applications are designed and operated today, as single self-contained units. Future applications will have not only data on the outside but integration on the outside too. With the increasing adoption of the integration cloud, more integration data and logic will start living on the outside.
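To illustrate what offloaded orchestration logic looks like, here is a hedged sketch of a two-step Saga with a compensating action, expressed in Amazon States Language and registered with AWS Step Functions via boto3; all ARNs and names are hypothetical placeholders:

```python
# A hedged sketch of offloading orchestration: a two-step Saga with a
# compensating action, kept and executed by AWS Step Functions rather
# than inside the application. ARNs/names are hypothetical placeholders.
import json
import boto3

definition = {
    "StartAt": "ReserveInventory",
    "States": {
        "ReserveInventory": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:reserve",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "CancelOrder"}],
            "Next": "ChargePayment",
        },
        "ChargePayment": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:charge",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "CancelOrder"}],
            "End": True,
        },
        "CancelOrder": {  # compensating action of the Saga
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:cancel",
            "End": True,
        },
    },
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="order-saga",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/order-saga-role",
)
```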
Temporal triggers
Temporal binding represents a time-bound specialization of orchestration binding. It has a single goal: to trigger various services at specific times based on a given policy. Examples in this category are AWS EventBridge Scheduler, Google Cloud Scheduler, Upstash QStash, and many others.
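For example, a temporal binding on AWS EventBridge Scheduler can be demanded with a few lines of boto3; the schedule name and ARNs below are hypothetical:

```python
# A hedged sketch of a temporal binding: a cron trigger on Amazon
# EventBridge Scheduler that invokes a Lambda function on a schedule.
import boto3

scheduler = boto3.client("scheduler")
scheduler.create_schedule(
    Name="nightly-report",                    # hypothetical schedule name
    ScheduleExpression="cron(0 2 * * ? *)",   # every day at 02:00 UTC
    FlexibleTimeWindow={"Mode": "OFF"},
    Target={
        "Arn": "arn:aws:lambda:eu-west-1:123456789012:function:report",
        "RoleArn": "arn:aws:iam::123456789012:role/scheduler-invoke-role",
    },
)
```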
Event-driven and messaging services
These bindings act as an event store to offload requests and decouple applications, but they are increasingly not limited to storage and are expanding toward providing message-processing patterns. They provide developer primitives on top of the event store, such as dead-letter queues, retries, and delayed delivery, as well as message-processing patterns such as filtering, aggregation, reordering, content-based routing, wiretap, etc. Examples of this binding are Confluent Cloud ksqlDB, AWS EventBridge, Decodable data pipelines, and others.
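As a small example of consuming one of these primitives as a service, here is a hedged boto3 sketch of the dead-letter queue pattern, using Amazon SQS as a widely known illustration; the queue names and account ID are illustrative:

```python
# A hedged sketch of the DLQ pattern consumed as a service: an SQS queue
# whose redrive policy moves messages to a DLQ after three failed
# deliveries, with no retry/DLQ code inside the application itself.
import json
import boto3

sqs = boto3.client("sqs")

dlq = sqs.create_queue(QueueName="orders-dlq")
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq["QueueUrl"], AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

sqs.create_queue(
    QueueName="orders",
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "3"}
        )
    },
)
```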
External connectors
These bindings help connect the application to external systems and also perform data normalization, error handling, protocol conversion, and data transformation. Examples are Knative source importers, AWS EventBridge connectors, Confluent Cloud connectors, Decodable Kafka connectors, and AWS Lambda event sources and destinations.
Health checks
Health checks are essential in compute bindings, where a failing health check usually causes the application to restart. Integration bindings also need health checks, but for a different purpose: an integration health check does not influence the application runtime; instead, it tells the integration cloud whether the application is capable of handling integration-driven interactions. A failing integration health check can pause the integration binding until the application is healthy again, at which point the integration bindings are resumed. Very often, you can use the same application endpoints for compute and integration binding checks. A good example is Dapr’s application health check, which can temporarily stop consumers and connectors from pushing data into an unhealthy application.
Other bindings
There are even more emerging bindings that fall under the category of integration bindings. One example is feeding an application with introspection data, such as the Kubernetes Downward API and Lambda environment variables, which provide a simple mechanism for introspection and metadata injection. Another is configuration and secret bindings, where secrets are not only injected into the application at startup time, but configuration updates are also pushed to the application, through sidecars such as the HashiCorp Vault Sidecar Injector, Dapr’s configuration API, or the Service Binding specification for Kubernetes. There are also less common patterns, such as the distributed lock, another type of integration binding that provides mutually exclusive access to a shared resource.
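For instance, an introspection binding can be as simple as reading metadata the platform injects into the environment. In this minimal Python sketch, the variable names are illustrative and assume a pod spec that maps them via the Downward API:

```python
# A minimal sketch of an introspection binding: reading platform-injected
# metadata from the environment. On Kubernetes these variables would be
# populated via the Downward API (fieldRef); on Lambda, the built-in
# AWS_LAMBDA_* variables play a similar role.
import os

pod_name = os.environ.get("POD_NAME")            # e.g., fieldRef: metadata.name
pod_namespace = os.environ.get("POD_NAMESPACE")  # e.g., fieldRef: metadata.namespace
print(f"Running as {pod_name} in {pod_namespace}")
```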
Integration binding trends
Containers are becoming the most popular and widely used portable format for packaging and running applications, regardless of whether they are long-running microservices or short-lived functions. Integration bindings, on the other hand, can be grouped into distinct problem areas, such as event-driven interactions, stateful orchestration, and state access, and vary in terms of underlying storage and usage patterns. For example, Apache Kafka is a de facto standard for event logs, the AWS S3 API for object access, Redis for key-value caching, PostgreSQL for relational data access, and so on. What makes them standards is the growing ecosystem of libraries, tools, and services built around them, giving assurance of a substantial degree of maturity, stability, and future backward compatibility. But these APIs alone are limited to storage access and often require developers to address distributed system challenges within the application code. Aligned with the direction of software commoditization moving up the stack, the integration bindings are becoming available as a service. A growing number of serverless cloud services offer additional integration capabilities that the application code can bind to in addition to data access.
In this model, a cloud-bound application typically runs on serverless compute infrastructure, following cloud-native primitives. It binds with other serverless cloud services for service orchestration, event processing, or synchronous interactions, as shown below.
Figure 6: Cloud-bound applications ecosystem
One project that unites most integration bindings and developer concerns into an open-source API is Dapr from the CNCF. It offers synchronous service invocation, stateful service orchestration, asynchronous event-driven interactions, and technology-specific connectors as APIs. Similar to how containers and Kubernetes act as a compute abstraction, Dapr acts as an abstraction for external services. Dapr also offers integration features that are independent of the underlying cloud services and that very often would otherwise have to be implemented in the application layer, such as resiliency policies, dead-letter queues, delayed delivery, tracing, fine-grained authorization, and others. Dapr is designed to be polyglot and to run outside of the application, making it easy to swap external dependencies without changing the internal architecture of an application, as described in the hexagonal architecture. While Dapr is used primarily by developers implementing applications, once introduced, it enhances the reliability and visibility of distributed applications, offering holistic benefits to operations and architecture teams. To learn more about this topic, join me in person or virtually at QCon London later this year, where I will speak about “How Application-First Cloud Services Are Changing the Game.”
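As a taste of the API, here is a hedged sketch of synchronous service invocation through the Dapr Python SDK, where the sidecar resolves the callee and applies any configured resiliency policies; the app-id, method, and payload are illustrative:

```python
# A hedged sketch of synchronous service invocation through Dapr: the
# sidecar resolves the target app, applies configured resiliency policies
# (retries, timeouts, circuit breakers), and traces the call.
from dapr.clients import DaprClient

with DaprClient() as dapr:
    resp = dapr.invoke_method(
        app_id="inventory",        # target application's Dapr app-id
        method_name="reserve",     # illustrative method name
        data=b'{"sku": "abc-123", "quantity": 2}',
        http_verb="POST",
    )
    print(resp.text())
```

Because the call goes through the sidecar rather than a vendor SDK, swapping the underlying infrastructure does not require changes to this application code.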
Post-cloud-native applications
Cloud-bound applications represent the progression of cloud native from addressing compute-only concerns to managing application-layer needs. This trend is accelerated by the expansion of cloud services up the application stack, from infrastructure to application-first services. We can observe this transition in the explosion of developer-centric cloud services for stateful orchestration, event-driven application infrastructure, synchronous interactions, cloud-based development and test environments, and serverless runtimes. This move to application-first cloud services is giving rise to a new application architecture where more and more application logic runs within cloud services. This blending of applications with third-party cloud services allows developers to offload more responsibilities; however, it can also limit the flexibility and agility needed to respond to changing business needs. To preserve the independence of an application’s internal and external architectures, applications and cloud services need to be decoupled with clean boundaries at development time and deeply bound together at runtime using well-defined open APIs and formats. In the same way containers and Kubernetes have provided open APIs for compute, we need open APIs for application integration abstractions. This will enable the portability and reuse of operational practices, tools, development patterns, and capabilities.