Comments

Data Mesh: Centralized ownership vs decentralized ownership — 9 Comments

  1. “domains thinking of their own data product” is a key issue. The primary consumer of a data product is likely to be the product owner. Having product owners take feature requests from other domains, and scaling to meet the needs of other domains, are but two of the many organizational challenges with mesh architectures, as they were with other distributed ownership architectures. Great article as always.

    • Thanks for this comment, it confirms my main concern with the data mesh: the concept assumes domains will go the extra mile to offer their data as data products for other domains to consume. But it remains unclear how these domains will be incentivized to go that extra mile.

  2. We are in the infancy of a data mesh that will need enterprise diligence in governance. We have used enterprise data engineers and domain experts to process, clean, and govern the data. Our success has come from data engineering leadership that empowers domain teams to govern their own data. Many companies have built a mesh, but very few have been able to keep it going across the enterprise. The breakdown is always in the governance.

  3. Pingback: Data Mesh and Ownership Strategies – Curated SQL

  4. Thank you James for the candid critique of Data Mesh, as well as your general thought leadership. When the concept is boiled down, the technical challenges only exist because we cannot delegate software/data development to business stakeholders (even if we call them “engineers”); you called this point out above. Organizations simply are not geared to maintain architect and developer capabilities within the business function, nor should they be. A better approach could be to rely on business functions for data governance and stewardship, but on enterprise IT to create the data glue. The more recent centralized model seems to have finally gained traction with products like Synapse and Snowflake. Augmenting these products with interoperability and governance feels like a better move than “data mesh” for the time being. What is the business case for data mesh vs. other modern approaches? (sincerely asked).

    • Data mesh tries to solve three challenges with a centralized data lake/warehouse:

      Lack of ownership: who owns the data – the data source team or the infrastructure team?
      Lack of quality: the infrastructure team is responsible for quality but does not know the data well
      Organizational scaling: the central team becomes the bottleneck, such as with an enterprise data lake/warehouse

      But as I pointed out in my blog you can certainly have a centralized approach without these challenges, without introducing the extra work and challenges a data mesh gives you.

  5. Hi James
    Your article is a good collation of challenges posed by Data Mesh. Federated computational governance can be challenging, as it may warrant a lot of self-discipline within each data domain.
    Viewed from an Agile team organisation perspective, if you have Chapters comprising team members who cut across the Squads (Data Product teams), you can at least make an attempt to ensure all domains have a healthy, interoperable ecosystem. I think Data Mesh relies heavily on the assumption that each domain exposes its metadata (along with its operational and analytical data), which must be well defined for other domains to consume. I have made an attempt to take the Data Mesh paradigm a small step forward via my blog on Medium, using a hypothetical use case in a healthcare setup. Hope it is worth a read.
    https://medium.com/capgemini-microsoft-team/data-mesh-implementation-in-a-multi-cloud-architecture-ac2a7b089789

  6. Over the past 20 years we have talked about data marts, hub and spoke, and federated queries as solutions to analytics, rather than fixing the issues with the centralized analytical data store. Now, it is data mesh. This too shall pass.

  7. Hello,

    I don’t see it as a big issue, and we have completely valid implementations available. Data Mesh is no different from the CQRS pattern (e.g. Saga, Event Sourcing patterns). But while CQRS was introduced for API communication, Data Mesh applies the same idea to the things we call Data Warehouses/Lakes/Hubs, and now Mesh.

    At my last company I tried to explain that there should be a single platform template in the form of CQRS, and that it should be integrated everywhere. The only centralized piece is the Event Store, so the way to speed it up is to create Snapshots. And there you have it.

    So imagine your standard 3-tier app design, e.g. front/back/storage, right? The next time you start developing a new app, the back end and storage will be handled by Data Mesh, so your API storage and data lake storage become one 🙂 That is the real beauty behind all this. And this is what I call microservices, or rather event-driven functions. The logic is distributed as the functions flow. As everything is an event, it can be transformed into any kind of domain, as CQRS demonstrates in the form of Projections/Aggregates.
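
The Event Store, Snapshot, and Projection idea in the comment above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not an implementation from the article or from any particular library: the names `Event`, `EventStore`, `Snapshot`, `apply_order`, and `project`, and the `order_placed` event shape, are all hypothetical. It shows an append-only log as the single shared piece, with a projection that resumes from a snapshot instead of replaying the whole log.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Event:
    """One immutable fact appended to the store (hypothetical shape)."""
    domain: str
    kind: str
    payload: dict[str, Any]


@dataclass
class EventStore:
    """Append-only log; the one centralized piece in this sketch."""
    events: list[Event] = field(default_factory=list)

    def append(self, event: Event) -> int:
        self.events.append(event)
        return len(self.events)  # position just past the appended event

    def read_from(self, position: int) -> list[Event]:
        return self.events[position:]


@dataclass
class Snapshot:
    """Cached projection state plus the log position it was built from."""
    position: int = 0
    state: dict[str, int] = field(default_factory=dict)


def apply_order(state: dict[str, int], event: Event) -> dict[str, int]:
    """Projection logic (assumed): running order total per customer."""
    if event.kind != "order_placed":
        return state  # events this projection does not care about are skipped
    new_state = dict(state)
    customer = event.payload["customer"]
    new_state[customer] = new_state.get(customer, 0) + event.payload["amount"]
    return new_state


def project(store: EventStore, snapshot: Snapshot) -> Snapshot:
    """Replay only the events recorded after the snapshot; return a new snapshot."""
    state = snapshot.state
    new_events = store.read_from(snapshot.position)
    for event in new_events:
        state = apply_order(state, event)
    return Snapshot(position=snapshot.position + len(new_events), state=state)
```

A domain would call `project(store, Snapshot())` once to build its view, keep the returned snapshot, and pass it back in on the next call so that only newer events are replayed. Each consuming domain could define its own projection function over the same log, which is one reading of the comment's point that events can be transformed into any kind of domain view.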
