Don’t be a Mess with the Data Mesh

https://www.infoworld.com/article/3402260/what-is-a-service-mesh-easier-container-networking.html

Data mesh is one of the hottest data management concepts in recent years, and many companies in Thailand have shown great interest in implementing the concept.

While companies such as Google and Databricks are on board with the data mesh, they do not offer any turnkey solutions or commercial products. The most recent Databricks article on implementing Data Mesh, “Databricks Lakehouse and Data Mesh, Part 1,” sets out to show that the Lakehouse is an excellent tool for implementing the Data Mesh. Still, companies interested in Data Mesh must design and implement the concept themselves, and the result is often a messy data mesh.

To Mesh or not to Mesh

Why is the result often a mess, and what can be done about it? Before getting into that, let us look at the four principles of the data mesh logical architecture:

  1. Domain ownership: adopting a distributed architecture where domain teams — data producers — retain full responsibility for their data throughout its lifecycle, from capture through curation to analysis and reuse
  2. Data as a product: applying product management principles to the data analytics lifecycle, ensuring quality data is provided to data consumers who may be within and beyond the producer’s domain
  3. Self-service infrastructure platform: taking a domain-agnostic approach to the data analytics lifecycle, using standard tools and methods to build, run, and maintain interoperable data products
  4. Federated governance: ensuring a data ecosystem that adheres to organizational rules and industry regulations through standardization

Companies can use these principles as a framework to determine whether investing in a Data Mesh makes sense. Below are three quick examples of companies that should not yet invest in a data mesh.

  1. A company that has only one or two domains. This usually implies that the company has few data sources and few functional teams that would use the data products.
  2. A company with a small data team (DE, DS, and DA) to support the domains. This is a bottleneck, because each domain needs a data team to help create data products.
  3. A company that does not have existing data governance implemented. This implies that the company might not have a solid infrastructure to handle decentralized organizational rules, industry regulations, and standards.
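
The three warning signs above can be turned into a rough self-check. The following is only a minimal sketch: the `CompanyProfile` fields, the function name, and the default thresholds are hypothetical and chosen for illustration (the "fewer than three domains" cutoff simply mirrors "one or two domains" above, and "one data team member per domain" is an assumption, not a rule from the data mesh literature).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CompanyProfile:
    """Hypothetical description of a company; field names are illustrative."""
    domain_count: int          # number of business domains producing data
    data_team_size: int        # total DE + DS + DA headcount
    has_data_governance: bool  # is any data governance already in place?


def mesh_readiness_blockers(profile: CompanyProfile,
                            min_domains: int = 3,
                            min_team_per_domain: int = 1) -> list[str]:
    """Return the reasons a company is probably not ready for a data mesh.

    An empty list means none of the three warning signs applies.
    The thresholds are assumptions and should be tuned per company.
    """
    blockers = []
    if profile.domain_count < min_domains:
        blockers.append("too few domains")
    if profile.data_team_size < profile.domain_count * min_team_per_domain:
        blockers.append("data team too small to staff each domain")
    if not profile.has_data_governance:
        blockers.append("no existing data governance")
    return blockers


if __name__ == "__main__":
    small_company = CompanyProfile(domain_count=2, data_team_size=1,
                                   has_data_governance=False)
    for reason in mesh_readiness_blockers(small_company):
        print("Not ready:", reason)
```

A check like this is only a conversation starter; the real decision still depends on whether the company has the scaling problem the Data Mesh was designed to solve.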

The Data Mesh concept is designed to decentralize. The goal is to give each domain more flexibility to share, access, and manage analytical data in a complex, large-scale environment. If your company does not have the problem that Zhamak Dehghani was trying to solve, then it will most likely not benefit from the Data Mesh.

Don’t be a MESS

These are some of the “Don’t” keys I have observed firsthand, for companies that are a good fit for the Data Mesh and want to implement it.

  1. Don’t be a hovering boss.
  2. Don’t practice centralized management on the domains.
  3. Don’t ignore data observability.
  4. Don’t focus first on technology.

Do the Mesh

What is the role of upper management and the central technical team in the Data Mesh paradigm? The following list is not exhaustive, but these are some “Do” keys that I found helpful while working on a data mesh project.

  1. Do support the domain’s needs.
  2. Do get to know each domain.
  3. Do promote data product quality metrics, governance, and standardization for each domain.
  4. Do promote autonomous provision.
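
As one possible illustration of data product quality metrics, a central team could offer each domain a small shared helper for checks such as completeness and freshness. This is a minimal sketch under assumptions: the two metrics, the function names, and the record layout are hypothetical examples, not a standard defined by the Data Mesh concept or by any particular platform.

```python
from datetime import datetime, timedelta, timezone


def completeness(records, required_fields):
    """Fraction of records in which every required field is present and not None.

    Returns 0.0 for an empty data product so that "no data" never looks healthy.
    """
    if not records:
        return 0.0
    complete = sum(
        all(record.get(field) is not None for field in required_fields)
        for record in records
    )
    return complete / len(records)


def is_fresh(last_updated, max_age, now=None):
    """True if the data product was updated within max_age of `now`.

    `now` defaults to the current UTC time; pass it explicitly in tests.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_updated <= max_age


if __name__ == "__main__":
    # Hypothetical records published by a "sales" domain.
    orders = [
        {"order_id": 1, "amount": 120.0},
        {"order_id": 2, "amount": None},
    ]
    print("completeness:", completeness(orders, ["order_id", "amount"]))
    print("fresh:", is_fresh(
        last_updated=datetime.now(timezone.utc) - timedelta(hours=2),
        max_age=timedelta(days=1),
    ))
```

Keeping such helpers domain-agnostic lets each domain report the same metrics in the same way while still owning its own data.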

Concluding Thoughts

The Data Mesh concept is here to stay. However, it is not a practical solution for a smaller company, or for one that does not have many functional teams or business units. While some of these dos and don’ts might be helpful for companies interested in getting into Data Mesh, they are only a crude introduction to the problem of implementing the concept. There are still more details that each company must investigate carefully.

