Setting up a Data Mesh Organization

A data mesh organization has three roles: producers, consumers, and the platform. According to Matthias Patzak, the mission of the platform team is to make the lives of producers and consumers simple, efficient, and stress-free. Data must be discoverable and understandable, trustworthy, and shared securely and easily across the organization.

Matthias Patzak gave a talk about data mesh platforms at FlowCon France.

In the InfoQ article How Data Mesh Platforms Connect Data Producers and Consumers, Patzak explained the concept of a data mesh:

A data mesh is an organizational paradigm shift in how companies create value from data where responsibilities go back into the hands of producers and consumers. It eliminates the specialized data organization as a proxy and bottleneck in the communication between producers and consumers.

Patzak mentioned that there are three roles in a data mesh organization: producers, consumers, and the platform. Producers are teams that generate transactional data with their applications, such as the web store or ERP system, he said. In the data mesh approach, they extend their applications so that they can easily use their transactional data for analytical use cases.

Consumers are teams that generate insights from analytical data, such as marketing, finance, or sales. They build their own business intelligence (BI) or analytics applications, Patzak said.

To ensure that this is done efficiently and in a reasonably standardized way, there is also the data platform, as Patzak explained:

It provides tools and infrastructure that make life for producers and consumers simple, efficient, and stress-free.

The data mesh platform consists of teams that support the producers and consumers. They should provide tooling and infrastructure, training and consulting, and governance and security, Patzak said.

Patzak referred to the book Data Mesh in Action by J. Majchrzak et al., which defines a data product as "an autonomous, read-optimized, standardized data unit containing at least one dataset (Domain Dataset), created for satisfying user needs". He adapted this definition:

A "data product" is a self-contained dataset curated by a cross-functional team, designed to be valuable and usable for end users. Its purpose is to provide reliable, quality data that’s ready for analysis, facilitating better decision-making within the organization.

Patzak said that transforming data into a data product within a data mesh requires more than just raw data. It requires a quantum of data that bundles the data with metadata, code, configuration files, and infrastructure, or more precisely infrastructure as code, he added.
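As a rough illustration of such a bundle, the sketch below models a data product as a small descriptor and reports which parts are still missing. The class name, field names, and example values are assumptions made for illustration; they do not come from Patzak's talk or from any particular data mesh platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProduct:
    """Hypothetical descriptor for the parts that make up a data product."""

    name: str
    owner_team: str
    datasets: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)
    code_repository: str = ""
    config_files: List[str] = field(default_factory=list)
    infrastructure_as_code: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Return the names of the parts that are still empty."""
        parts = {
            "datasets": self.datasets,
            "metadata": self.metadata,
            "code_repository": self.code_repository,
            "config_files": self.config_files,
            "infrastructure_as_code": self.infrastructure_as_code,
        }
        return [part for part, value in parts.items() if not value]


# Raw data alone: a dataset exists, but nothing else is attached to it yet.
raw_only = DataProduct(
    name="orders",
    owner_team="web-store",
    datasets=["orders_daily"],
)

# The same dataset with the remaining parts filled in (all values are made up).
complete = DataProduct(
    name="orders",
    owner_team="web-store",
    datasets=["orders_daily"],
    metadata={"description": "Daily order totals per country"},
    code_repository="git@example.com:web-store/orders-data-product.git",
    config_files=["pipeline.yaml"],
    infrastructure_as_code=["infrastructure/main.tf"],
)

print(raw_only.missing_parts())
print(complete.missing_parts())
```

A check like this could run in a platform team's publishing pipeline, so that a dataset without metadata or infrastructure definitions is flagged before it is offered to consumers.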

Patzak mentioned that data must be discoverable and understandable, ensuring that it’s easy to find and that its content is clear to users from both a technical and organizational perspective. It should also be trustworthy, maintaining integrity and SLAs to ensure reliable use. Above all, it should be secure, so that it can be shared safely and easily across the organization.

Simply building and providing a data product does not guarantee success or benefit, Patzak mentioned. You have to measure and prove the success of the data product. The best way to do this is with hypothesis-driven experiments in which A/B tests are used to measure whether the use of the data product delivers added value, similar to features in transactional applications, he concluded.
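One way to evaluate such an experiment is a two-proportion z-test on a conversion metric. The sketch below is a minimal version under stated assumptions: the function name, the choice of conversion rate as the metric, and the sample numbers are all illustrative and are not taken from the talk.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conversions_a: int, users_a: int, conversions_b: int, users_b: int
) -> Tuple[float, float]:
    """Compare two conversion rates; return (z statistic, two-sided p-value).

    Group A is the control without the data product,
    group B is the variant that uses it.
    """
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_error == 0:
        raise ValueError("No variation in outcomes; the test is undefined.")
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 5,000 users per group, with more conversions in group B.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

The hypothesis and the significance threshold should be fixed before the experiment starts, so that the result confirms or rejects the expected benefit rather than being interpreted after the fact.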

InfoQ interviewed Matthias Patzak about using a data mesh.

InfoQ: How can data producers and data consumers collaborate effectively?

Matthias Patzak: As written in the Agile Manifesto: "The most efficient and effective method of conveying information to and within a development team is face-to-face conversation." Producers and consumers of data should communicate, collaborate, and co-create directly.

When you implement a data mesh, you can apply a lot of things that we as a community have already learned through agile, microservices, and especially DevOps, where two different parties work together successfully. Data mesh from a people and process perspective is not different.

InfoQ: How can consumers use data to generate value?

Patzak: Just like producers, consumers need to work backwards from their customers and their problems. The data product must generate very specific insights and enable human or automated decisions.

InfoQ: What can be done to make it easier to discover data products in a data mesh?

Patzak: In large organizations, data catalogues are useful in the way a telephone book or a directory is, but not for looking up a specific data set and consuming it straight away. Instead, the main purpose of a data catalogue is to discover the owner of the data product and provide some context about the data product. The next step is to reach out to the owner of the data product and communicate directly.
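The directory idea can be sketched as a small in-memory catalogue whose search returns the owner and some context rather than the data itself. All class names, fields, and entries below are hypothetical and only serve to illustrate the lookup described in the answer.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CatalogueEntry:
    """Hypothetical catalogue record: who owns a data product and what it is."""

    product_name: str
    owner_team: str
    contact: str
    description: str


class DataCatalogue:
    """A directory of data products; it stores no datasets, only entries."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        self._entries[entry.product_name] = entry

    def search(self, keyword: str) -> List[CatalogueEntry]:
        """Case-insensitive keyword match on product name and description."""
        needle = keyword.lower()
        return [
            entry
            for entry in self._entries.values()
            if needle in f"{entry.product_name} {entry.description}".lower()
        ]


catalogue = DataCatalogue()
catalogue.register(
    CatalogueEntry(
        product_name="orders",
        owner_team="web-store",
        contact="web-store-team@example.com",
        description="Daily order totals per country",
    )
)
catalogue.register(
    CatalogueEntry(
        product_name="invoices",
        owner_team="erp",
        contact="erp-team@example.com",
        description="Posted invoices from the ERP system",
    )
)

# The result tells a consumer whom to contact, not how to read the data.
for entry in catalogue.search("order"):
    print(f"{entry.product_name}: owned by {entry.owner_team} ({entry.contact})")
```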
