So, what's really behind the hype?
Data mesh is primarily an organizational change.
Responsibility for data shifts closer to the business value stream.
This enables faster data-driven decisions and reduces barriers for data-centric innovations.
Who has actually implemented a data mesh?
There is a comprehensive collection of user journey stories from the Data Mesh Learning community
that covers data mesh examples from many different industries.
Is Data Mesh for my company?
It depends, of course.
There are a few prerequisites that should be in place:
You should have modularized your software system following domain-driven design principles or something similar.
You should have a good number (5+) of independent domain teams that have their systems already running in production.
And finally, you should trust your teams to make data-driven decisions on their own.
How to get started?
Start small and agree on the big picture.
Find two domain teams (both around level 2) that share a high-value use case in which one team needs data from the other.
Let one team build a data product (level 4) and the other team use that data product (level 3).
You don’t need a sophisticated data platform yet.
You can start by sharing files via AWS S3 or a Git repository, or by using a cloud-based database such as Google BigQuery.
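To make the file-based start concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a prescribed implementation: the function names, the `data.csv` and `metadata.json` file layout, and the metadata fields are all hypothetical, and a local directory stands in for an S3 bucket or a Git repository checkout.

```python
import csv
import json
from pathlib import Path


def publish_data_product(target_dir, name, owner, rows):
    """Producer side: write rows as CSV plus a small metadata file.

    The layout (data.csv + metadata.json) is an assumption for this sketch.
    """
    target = Path(target_dir) / name
    target.mkdir(parents=True, exist_ok=True)
    columns = list(rows[0].keys())
    with open(target / "data.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
    # Minimal, hypothetical metadata so consumers know who owns the data.
    metadata = {
        "name": name,
        "owner": owner,
        "columns": columns,
        "row_count": len(rows),
    }
    (target / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return target


def read_data_product(target_dir, name):
    """Consumer side: load the metadata and rows of a published data product."""
    source = Path(target_dir) / name
    metadata = json.loads((source / "metadata.json").read_text(encoding="utf-8"))
    with open(source / "data.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return metadata, rows
```

The producing team would call `publish_data_product` and sync the resulting directory to the shared location; the consuming team would call `read_data_product` on its copy. Keeping the owner and column list next to the data is one simple way to make the files usable by another team without a data platform.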
When should I avoid a Data Mesh?
There are some indicators that a data mesh approach might not be suitable for you, including:
- You are too small and don’t have multiple independent engineering teams.
- You have low-latency data requirements. Data Mesh is a network of data. If you need to optimize for low latency, invest in a more integrated data platform.
- You are happy with your monolithic, highly integrated system (such as SAP). It might be more efficient to use its analytical platform.
Is the Data Mesh a generic solution to a distributed data architecture?
No. By definition, data mesh does not include data products that serve real-time needs.
Data mesh focuses on analytical use cases.
What's the difference between data mesh and data fabric?
At first, data fabric looks similar to data mesh because it offers a similar self-serve data platform.
Looking deeper, it turns out that data fabric is a central and domain-agnostic approach, which is in strong contrast to the domain-centric and decentralized approach of data mesh.
More in this comparison article.
What might a journey be for teams who operate commercial off-the-shelf (COTS) systems?
Many COTS systems (such as Salesforce, SAP, Shopify, Odoo) provide domain-optimized analytical capabilities.
So the journey for domain teams starts directly from level 2.
The challenge is to integrate data products from other domains (level 3, which may be skipped if not needed)
and to publish data products for other domains (level 4).
The system’s data needs to be exported to the data platform and managed as data products that conform to the global policies.
As data models evolve with system updates, an anti-corruption layer is a must, e.g., as a cleaning step.
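As an illustration of such a cleaning step, the following Python sketch translates records exported from a COTS system into the stable model of a data product. Everything specific here is hypothetical: the source field names (`CustNo`, `OrdDate`, `NetAmt`), the target field names, and the function name are assumptions made for the example, not fields of any real system.

```python
# Hypothetical mapping from the COTS system's export fields to the
# field names promised by the data product.
FIELD_MAPPING = {
    "CustNo": "customer_id",
    "OrdDate": "order_date",
    "NetAmt": "net_amount",
}


def to_data_product_record(source_record):
    """Translate one exported record into the data product's model.

    Unknown source fields are dropped, so new vendor fields do not leak
    into the data product. Missing fields raise an error, so a changed
    source model is noticed instead of silently producing gaps.
    """
    missing = [field for field in FIELD_MAPPING if field not in source_record]
    if missing:
        raise ValueError(f"source export is missing fields: {missing}")
    return {
        target: source_record[source]
        for source, target in FIELD_MAPPING.items()
    }
```

When a system update renames or adds fields, only `FIELD_MAPPING` needs to change, and consumers of the data product keep seeing the same field names.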
How might externally acquired datasets be part of a data mesh?
Typical examples are price databases or medical studies.
A team needs to own this dataset and bring it into the data mesh. If this is not a very technical team, the data platform should offer an easy self-service way to upload files and provide metadata. An Excel API or Google Sheets might also be an option here.
How did you draw the diagrams?
We got this question quite a lot, so we are happy to share our tooling:
We use diagrams.net with the "Sketch" style and Streamline Icons.
We automate the conversion to PNG, SVG and WebP with a little script.
What are your questions?
If you have any more questions, we encourage you to discuss them with us on GitHub or reach out to us directly.
But be warned:
Your question might end up in the FAQ. :-)