There is currently a lot of hype around event-driven systems; sometimes they are almost seen as the "magic thing" in our quest for decoupled systems, Bernd Rücker noted at the QCon London 2019 conference. In his presentation he took a critical look at three common hypotheses around event-driven systems: events decrease coupling, orchestration needs to be avoided, and workflow engines are painful.
Events decrease coupling
Rücker, co-founder of Camunda, the company behind the Camunda Workflow Engine, agrees that events can decrease coupling, and he uses a notification service as a good example. This service has a clear responsibility for dealing with all notifications, for instance notifying customers about shipped orders. Apart from listening to events, it is independent of other services, and it removes the need for notification handling in every other service.
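As a rough illustration of this kind of notification service, the following sketch subscribes to an assumed `order-events` Kafka topic and reacts only to shipped orders; the topic name, event key, and payload are illustrative assumptions, not taken from the talk:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical notification service: it only listens to events and turns
// them into customer notifications; no other service knows it exists.
public class NotificationService {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "notification-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("order-events"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    if ("GoodsShipped".equals(record.key())) {
                        // e.g. send a "your order has shipped" email; the
                        // shipping service needs no notification logic at all
                        System.out.println("Notify customer for order " + record.value());
                    }
                }
            }
        }
    }
}
```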
But with complex event flows in a peer-to-peer event chain, you lose the overview of how events flow through the system. With events corresponding to the stages of an order (order placed, payment received, goods fetched, goods shipped), each published by the relevant service, it can be hard to get a picture of the overall flow from a business perspective. Rücker refers to an article by Martin Fowler, who points out that although the event notification pattern can be useful, it also adds a risk of losing sight of the larger-scale flow.
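To make the problem concrete, here is a minimal sketch of one link in such a chain, assuming hypothetical event types and a generic event bus; note that nothing in the code reveals the end-to-end business flow, which only emerges from which service subscribes to which event:

```java
import java.util.function.Consumer;

// Illustrative events from the order example; names are made up.
record PaymentReceived(String orderId) {}
record GoodsFetched(String orderId) {}

// One link in the peer-to-peer chain: the inventory service reacts to
// PaymentReceived and publishes GoodsFetched. It does not know that the
// shipment service will later react to GoodsFetched.
class InventoryService {
    private final Consumer<Object> eventBus;

    InventoryService(Consumer<Object> eventBus) {
        this.eventBus = eventBus;
    }

    void on(PaymentReceived event) {
        // ... pick the goods from stock ...
        eventBus.accept(new GoodsFetched(event.orderId()));
    }
}
```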
One approach to regaining a view of the event flows is monitoring and tracing, and in an article on InfoQ, Rücker describes examples of how this can be done:
- Distributed tracing
- Data lake or event monitoring
- Process mining
- Process tracking
A potential problem with a peer-to-peer event chain arises when the workflow needs to change, for instance when goods should be fetched before payment to optimize delivery times; several services will then need changes to their event subscriptions. This also requires coordination between teams and in the deployment of the services, as well as consideration of ongoing orders and events still active in the system. Coordination between microservices teams is something we want to avoid, and Rücker refers to Eric Evans, who has compared this coordination to a three-legged race: velocity decreases and the risk of falling increases.
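A sketch of what such a reordering touches, reusing the illustrative service and event names from the earlier sketch:

```java
// Reordering the flow so that goods are fetched before payment is received
// changes several subscriptions at once (all names illustrative):
//
//   before                             after
//   Payment   <- OrderPlaced           Inventory <- OrderPlaced
//   Inventory <- PaymentReceived       Payment   <- GoodsFetched
//   Shipment  <- GoodsFetched          Shipment  <- PaymentReceived
//
// Each changed subscription lives in a different service, typically owned
// by a different team, and has to be deployed in a coordinated way while
// orders are still in flight.
```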
Orchestration needs to be avoided
To ensure that business processes are fulfilled, Rücker prefers to extract this end-to-end responsibility into one service, using a customer order as his example. One advantage is that a single service is responsible for something that is very important to the company, which for him makes total sense. You get one single point where you can control the sequence of things; the workflow lives inside the service boundary. This also makes it possible to start using commands, for example "retrieve payment", to control the workflow. Commands can help in avoiding potentially complex peer-to-peer event chains, and he emphasizes that this is not a move to REST or something similar; it's still messaging.
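A minimal sketch of what sending such a command could look like over Kafka, assuming a hypothetical `payment-commands` topic and JSON payload; the point is that the intent is addressed to one recipient, while the transport is still a message rather than a synchronous REST call:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical order service issuing a command: it tells the payment
// service what to do, but asynchronously, via messaging.
public class OrderService {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A command, not an event: "RetrievePayment" names an intent
            // directed at one recipient; the order service owns the sequence.
            producer.send(new ProducerRecord<>("payment-commands",
                    "RetrievePayment", "{\"orderId\":\"order-42\",\"amount\":99.95}"));
        }
    }
}
```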
Commands are orchestration: you tell someone to do something. But Rücker points out that the orchestration is an internal part of a microservice; it's not some external ESB-like middleware. A risk with orchestration is ending up with a few smart services that tell anaemic CRUD services what to do, and he refers to Sam Newman and his book Building Microservices. Rücker agrees that this is a risk, but notes that it's not something that automatically happens with orchestration. For him it only happens with bad API design, and he uses a payment service as an example: the service must be responsible for everything regarding a payment and should only return payment received or payment failed, not internal problems like temporary communication problems with external services.
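A sketch of this API-design point, with all names hypothetical: the payment service retries temporary failures internally and only ever returns the two business-level outcomes, so the orchestrating service never has to be "smart" about payment internals:

```java
// Hypothetical payment service: its own complications (e.g. a flaky
// credit-card provider) stay inside the service boundary.
class PaymentService {
    private final CreditCardClient creditCard = new CreditCardClient();

    // Handles a "retrieve payment" command from the orchestrating service.
    PaymentOutcome handle(String orderId, double amount) {
        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                creditCard.charge(orderId, amount);
                return PaymentOutcome.PAYMENT_RECEIVED;
            } catch (RuntimeException temporaryFailure) {
                // Internal concern: retried here, never surfaced to the caller.
            }
        }
        // Only the business-level failure leaves the service.
        return PaymentOutcome.PAYMENT_FAILED;
    }

    enum PaymentOutcome { PAYMENT_RECEIVED, PAYMENT_FAILED }
}

// Stand-in for a client calling an external payment provider.
class CreditCardClient {
    void charge(String orderId, double amount) { /* call external provider */ }
}
```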
Workflow engines are painful
For Rücker, the statement that workflow engines are painful is no longer true. There are, or have been, complex, proprietary, and centralized out-of-the-box solutions, but new tools are emerging that are relevant in modern architectures. They are also lightweight, which for Rücker means that they can be started with a few lines of code and that workflows can be defined in code, for example in a Java DSL. The big cloud vendors have their own products, like AWS Step Functions, Azure Durable Functions, and Google Cloud Composer. There are also lightweight open-source workflow engines like Activiti, Camunda, jBPM, and Zeebe.
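As an illustration of such a code-first DSL, here is a minimal sketch using Camunda's BPMN fluent builder; the process and task ids are made up for the order example, and a real deployment would still need task implementations wired in:

```java
import org.camunda.bpm.model.bpmn.Bpmn;
import org.camunda.bpm.model.bpmn.BpmnModelInstance;

// Defining a workflow in plain Java code instead of a graphical tool.
public class OrderWorkflow {
    public static void main(String[] args) {
        BpmnModelInstance model = Bpmn.createExecutableProcess("order-fulfilment")
                .startEvent()
                .serviceTask("retrieve-payment").name("Retrieve payment")
                .serviceTask("fetch-goods").name("Fetch goods")
                .serviceTask("ship-goods").name("Ship goods")
                .endEvent()
                .done();

        // The model could now be deployed to an engine; printing the
        // generated BPMN XML shows the workflow is just code plus data.
        System.out.println(Bpmn.convertToString(model));
    }
}
```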
Rücker has published a sample application demonstrating his ideas in an order fulfilment system based on independent components (microservices) and Kafka. Most presentations at the conference were recorded and will be available on InfoQ over the coming months. The next QCon conference, QCon.ai, will focus on AI and machine learning and is scheduled for April 15-17, 2019, in San Francisco. QCon London 2020 is scheduled for March 2-6, 2020.