Key Takeaways
- A new breed of integration software is arising that syncs business data into a simplified data hub and then syncs that data to the destination system.
- The benefit of this integration pattern is that it reduces the number of manual transformations required (often to zero) and makes it easier to write manual transformations when you have to.
- The two risks these tools must overcome are the security of the data and the breadth of scenarios that the simplified data format suits.
- The new breed of companies face competition from existing integration companies, cloud platforms, data lake companies such as Snowflake and, potentially, RPA companies such as AutomationAnywhere and UIPath.
- Regardless of whether this new breed of integration company can overcome the risks and competition, we see this integration pattern having a lasting impact on how companies integrate.
Ask any primary school student how to tackle an algebra problem and they’ll tell you to start by simplifying it. With that in mind, it seems bizarre that for most data integration problems we start by making the data more complex - we put it into a standards-based XML format.
A new breed of integration tools from companies such as Merge.dev, Codat.io and Stedi.com is bucking that trend, and I’m a big fan of their approach. These integration tools have simplified data models for most common business data objects (suppliers, customers, employees, invoices etc) and have connectors that take data from many popular software packages (or EDI, Electronic Data Interchange, formats in Stedi’s case) and convert it to that simplified model.
As an integrator, this makes your job easier because you don't need to learn the arcane idiosyncrasies of each system. You can focus on transforming data from one simplified format to another.
The key difference between this new breed of integration companies and those that came before them is that, instead of just being a pipeline from one system to another, they actually transform your data and store it in a simplified format. You can think of this not as a pipeline but as a hub with two syncs: one from the source system to the simplified data hub and another from the simplified data hub to the destination system.
This "hub" difference dramatically changes the economics of integration. To compare it to the efficiency benefits of a hub-and-spoke airport model vs a point-to-point airport model actually undersells the benefits of a hub-and-spoke integration approach.
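To put rough numbers on that claim, here is a back-of-the-envelope sketch. It assumes the worst case, where every system must be able to send data to every other system, and it counts one-way pipelines; both function names are mine, for illustration only. Real estates are rarely fully connected, so treat the figures as an upper bound rather than a forecast.

```python
def point_to_point_pipelines(n_systems: int) -> int:
    """One-way pipelines needed so every system can send to every other one."""
    return n_systems * (n_systems - 1)


def hub_syncs(n_systems: int) -> int:
    """One inbound and one outbound sync per system, all routed via the hub."""
    return 2 * n_systems


# Compare the two approaches as the number of systems grows.
for n in (3, 5, 10, 20):
    print(f"{n:>2} systems: {point_to_point_pipelines(n):>3} pipelines "
          f"vs {hub_syncs(n):>2} hub syncs")
```

With three systems the two approaches break even at six. At ten systems it is 90 pipelines against 20 syncs, and the gap keeps widening because one count grows quadratically and the other linearly.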
Merge.dev and Codat refer to themselves as Universal or Unified APIs and Stedi sees itself as tooling to build a universal API but I’m going to refer to them as Business Integration Hubs in this article to emphasise the connection between their approach and the efficiencies created by the approach.
This article examines the emergence of Business Integration Hubs and the benefits and risks of the approach, takes a quick look under the hood of three of them, and assesses the competitive landscape and the future of this approach.
Four phases of integration
We’re currently in what I view as the fourth phase of integration tools.
The phases (which overlap each other considerably) are:
- Direct integration
- Bespoke pipelines
- API / Connectors
- Transformation
In the first phase (Direct Integration), you connected internal systems by writing SQL queries directly against databases and you integrated with your customer and supplier systems using XML or text delimited standards such as cXML, X12 EDI and OASIS. Integration projects could only be justified for systems with lots of transactions and considerable, ongoing manual effort.
The next phase (Bespoke pipelines) saw the rise of companies like Snaplogic, Jitterbit, Talend and others that made it possible to push and pull data from legacy systems by writing bespoke integration pipelines.
From here, two parallel paths emerged - both relating to APIs. On the one hand, enterprises began exposing API endpoints on hosted systems using software such as Mulesoft. On the other hand, companies such as Zapier, Tray.io and n8n.io started putting API endpoints on SaaS software.
We’re now in what I call the Transformation phase, where the companies involved in the previous phases are building out their transformation capabilities to speed up integration. No longer are authentication and connectors the most time-consuming part of integrating systems - the most time-consuming part is transforming data from one system into a format suitable to go into another system. And companies such as tray.io and n8n.io are building really nice data transformation capabilities to make it quicker than ever before to connect two systems together.
Business Integration Hubs are taking this one step further by making it so, for many integration tasks, you don’t have to do any data transformation. And even when you do have to write the transformations yourself, you are starting from a simplified data format and transforming it to another simplified format.
Stedi.com is perhaps the best example of this. As you may have guessed, Stedi is an EDI data transformation system. They have built a tool to convert EDI data into JSON documents and back again. This is a pretty incredible feat because EDI data formats are mind-numbingly dull to look at and brain-splittingly hard to work with. So, instead of wasting your life learning about how the segments in each type of EDI document work, you can just start writing transformations using their tools, or your preferred programming language.
"At Stedi, we give developers the tools to define their own schemas that fit within (and sometimes outside of) the confines of various EDI standards. For example, Stedi Guides enables developers to define their own "opinion" of an X12 EDI 810 (Invoice) and build their own integration based on that structure. Now, users can build systems that can communicate with virtually any trading partner on the same EDI standard, regardless of what system or API they are using internally."
- David Kanter, Customer Operations, Stedi.com
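To give a feel for what converting EDI into JSON involves, here is a toy sketch in Python. It is not Stedi's tooling or output format. The invoice fragment is made up and heavily trimmed, the output field names are hypothetical, and it assumes the common X12 convention of "*" between elements and "~" at the end of each segment.

```python
import json

# A made-up, heavily trimmed X12 810 (Invoice) fragment. Real documents also
# carry envelope segments (ISA, GS, ST and so on) and may use other delimiters.
RAW_EDI = "BIG*20230115*INV-1001**PO-7788~IT1*1*10*EA*2.50~IT1*2*4*EA*12.00~TDS*7300~"


def parse_segments(raw: str) -> list[list[str]]:
    """Split raw EDI text into segments, each a list of elements."""
    return [seg.split("*") for seg in raw.strip().split("~") if seg]


def invoice_to_dict(raw: str) -> dict:
    """Map the few segments above onto a simplified (hypothetical) invoice shape."""
    invoice = {"lines": []}
    for seg in parse_segments(raw):
        tag = seg[0]
        if tag == "BIG":  # beginning segment: date, invoice no., PO date, PO no.
            invoice["invoice_date"] = seg[1]
            invoice["invoice_number"] = seg[2]
            invoice["po_number"] = seg[4]
        elif tag == "IT1":  # line item: line no., quantity, unit, unit price
            invoice["lines"].append({
                "line": int(seg[1]),
                "quantity": float(seg[2]),
                "unit": seg[3],
                "unit_price": float(seg[4]),
            })
        elif tag == "TDS":  # invoice total, written with two implied decimals
            invoice["total"] = int(seg[1]) / 100
    return invoice


print(json.dumps(invoice_to_dict(RAW_EDI), indent=2))
```

Even this tiny example shows why a simplified format is easier to work with: once the positional segments have become named fields, a transformation can refer to `invoice_number` rather than "the second element of the BIG segment".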
Merge.dev and Codat are tackling a different part of the problem. They have defined JSON schemas of most of the common financial documents and set up a nice workflow for connecting to different source systems. With this work done, they are now building out their connectors to cover as many systems as possible.
Benefits
The big benefit of using a Business Integration Hub is that once you connect your transactional systems to the hub it becomes trivial to integrate with several systems. So, if you use Netsuite as your Finance suite, Salesforce as your CRM and BambooHR as your HR system, you can connect all of these to your hub and exchange data easily between them. Of course, all of this can be done with a point-to-point integration tool like Tray.io or n8n.io but the promise of all of the data being stored in common simplified models will make this easier.
To better understand the benefits, let’s look in some detail at the economic benefit of the hub-and-spoke airport model. The airline industry has largely organised itself around hubs. When you fly from Nashville, Tennessee to Albuquerque, New Mexico, you can’t catch a direct flight. You have to go through the Dallas hub (with American Airlines) or the Denver hub (with United Airlines). This is because there is not enough passenger traffic from Nashville to Albuquerque to make it financially viable for an airline to set up and operate such a route. But when they can bring all of the Albuquerque-bound passengers from lots of cities into their hub they can easily offer several flights per day between the hub and Albuquerque. Integration works in exactly the same way. It is time-consuming to build good pipelines between systems. There is always going to be an economic advantage in using a hub.
Where the analogy between integration hubs and airport hubs breaks down, it breaks down to the benefit of integration hubs. When you are flying, you always want to take the route with the fewest stop-overs. In the integration world, it doesn’t matter so much. Whether you transform data directly from a source format to the destination format or use a simplified representation in between is often a matter of personal preference. As someone who does lots of integrations, I typically find it easier to transform data from a source format to a simpler format and then from that simpler format to the destination format than doing the transformation directly.
"The hardest part of integrating two systems is mapping and transforming the data. The key to solving this problem is standardization, and that's why we've spent years refining our data models in some of the most complex domains of financial data - accounting and commerce. That's where our users really perceive the value and quality of what we offer."
- Dave Hoare, CTO & Co-Founder, Codat
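The two-step approach described above, from source format to a simpler format and then on to the destination format, can be sketched in a few lines of Python. Everything here is hypothetical: the `Customer` model, the "CRM", "webshop" and "finance" systems and all of their field names are invented for illustration and do not reflect any vendor's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical simplified 'hub' model for a customer record."""
    name: str
    email: str
    country: str


# Step 1: one small mapper per source system INTO the simplified model.
def from_crm(record: dict) -> Customer:
    return Customer(
        name=f"{record['FirstName']} {record['LastName']}",
        email=record["Email"].lower(),
        country=record["BillingCountry"],
    )


def from_webshop(record: dict) -> Customer:
    return Customer(
        name=record["full_name"],
        email=record["contact"]["email"].lower(),
        country=record["country_code"],
    )


# Step 2: one mapper per destination system OUT of the simplified model.
def to_finance(customer: Customer) -> dict:
    return {
        "entityName": customer.name,
        "emailAddress": customer.email,
        "countryCode": customer.country,
    }


crm_record = {"FirstName": "Ada", "LastName": "Lovelace",
              "Email": "Ada@Example.com", "BillingCountry": "GB"}
webshop_record = {"full_name": "Grace Hopper",
                  "contact": {"email": "grace@example.com"},
                  "country_code": "US"}

for customer in (from_crm(crm_record), from_webshop(webshop_record)):
    print(to_finance(customer))
```

The point of the design is that `to_finance` never needs to know which system a customer came from. Adding a third source means writing one new mapper into `Customer`, not a new mapper for every destination.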
Risks
But setting out the benefits like this highlights two significant risks:
- All of your data is now in one place. Is it safe?
- Your data is being stored in simplified models. Are these sufficient for your needs?
If Business Integration Hubs are going to be used by medium to large enterprises they are going to need to absolutely nail data governance and security. Here’s what Merge.dev has to say on the topic:
"Keeping data safe is no simple feat. We're obligated to prove to our customers that we not only protect their data, but that we also comply with varying regulations around the world. Merge encrypts data at rest and in transit, and then encrypts it again using keys stored in external services. We also store EU and US data in completely isolated environments to ensure our customers are able to comply with the GDPR."
- Gil Feig, Co-Founder, Merge.dev
Competition
The Hub approach makes a lot of sense but this may not mean that the early players end up as winners. They will face competition from the following types of companies:
- Existing integration companies such as tray.io and n8n.
- Cloud platforms such as Microsoft Azure, Google Cloud Platform and AWS.
- Data lake companies such as Snowflake, and
- RPA companies such as AutomationAnywhere and UIPath.
Is this a winner-takes-all category?
We don’t see integration as a winner-takes-all category where there will be one single successful Hub that every business will have to be a part of. Certainly, there are network effects at work within each Hub, but they are weak. For example, if several of your trading partners use a particular Hub it’ll be easier to integrate with them if you use the same Hub. But, inevitably, there will be very good pathways between Hub competitors, so it probably won’t matter much.
For example, it is easy to imagine a future where a hardware store chain in the US adopts Stedi as their EDI platform. If Supplier 1 uses the Codat hub they will connect Codat to Stedi. If Supplier 2 uses the Merge.dev hub it will be just as easy for them to connect to Stedi. And, if suppliers 1 & 2 want to do some joint marketing together, they will be able to sync their CRM data between Codat and Merge.dev.
Competition from existing integration providers
Existing integration companies such as Tray.io and n8n.io will be the first competitors faced by the Hubs (although the Hubs handle what is essentially a subset of the integration problems handled by Tray.io or n8n.io). It’s almost a philosophical difference between them. Is it better to do point-to-point integrations between different systems or is it better to sync your data to a Hub and utilise its existing connections?
Cloud platform competitors
Every SaaS company needs to assess how the big cloud providers (Microsoft Azure, Google Cloud Platform and AWS) will react if they are successful.
With the possible exception of Microsoft, we don’t see the big three cloud platform players being significant players in this space. It is a bit too far up the stack to be serviced well by them. You can see this by looking at Azure’s Dataverse and AWS’s Appflow.
MS Dataverse is a thin veneer over a SQL Server database that defines commonly used tables such as suppliers, customers, invoices and bills. I was very excited when this first came out but, in practice, it is not much easier to use than actually writing SQL queries.
AWS Appflow is the start of an integration service by AWS. It looks like it is going to be a very robust pipeline between SaaS systems. But it doesn’t impose any schema on the integrations. Instead, this is left to the end user, which is great for some scenarios but not for the scenarios that Integration Hubs are tackling.
GCP’s approach confuses me (Trifacta, Cloud Composer, Workflows, AA etc) so I’ll leave it to the reader to explain it to me :)
Snowflake
Snowflake is in a great position to put together a really capable Hub. Until recently, Snowflake has focussed on analytical data rather than operational data, but their recent Unistore announcement indicates they are moving towards becoming a transactional storage solution. The biggest challenge for Snowflake is going to be focus. After years of sucking on the teat of massive enterprises, can they get their team to sell an SME offering?
RPA
RPA (Robotic Process Automation) software has changed a lot over the past decade. Instead of being just a way to get data into and out of legacy systems through their user interfaces, RPA is now a fully-fledged integration tool.
RPA companies have a few things working in their favour for becoming credible Business Integration Hubs. Firstly, they have large developer communities that spend their days interacting with standard business data objects like customers, suppliers, invoices, support tickets etc. These developer communities have a deep understanding of the problems of transforming data from one form to another. Secondly, the RPA companies all have marketplaces that would provide a way to distribute connectors made by their community and even a way to compensate their community for building connectors.
But, as with Snowflake, sales and distribution may be a problem. Competing with Integration Hubs might be too big a jump for the RPA sales teams to make.
Challenges
Aside from the risks mentioned above, the big challenge is walking the line between being simple enough to be easily used and broad enough to handle most integration scenarios. Anyone who has followed a Hello World tutorial knows that tech can appear easy but then get very hard very quickly as you step through the next few examples. My advice to Hubs: Lean into this. Make the easy stuff easy and provide functionality for more expert users to do the hard stuff for you.
A good example of this is Merge.dev’s recently announced passthrough functionality. If you have your accounting data sitting in Merge.dev and you need to connect to a financial system that is on Merge.dev, but the endpoint you need is not yet configured for Merge.dev, you can still utilise their authentication features but bring your own connector.
Future
The future of Integration Hubs as an integration pattern looks bright. The benefit of storing data in simplified formats for most integration tasks (immediate integration in some instances and simplified transformations when immediate integration is not possible) will be impossible to pass up. Increasingly, the role of the integrator will be to integrate between Hubs rather than between software systems.
Time will tell whether the current crop of Hubs will ultimately succeed, but the economics of using a Business Integration Hub definitely work in their favour.