The third principle of the framework, the Self-Serve Data Platform, mitigates a large part of the redundancy and inefficiency risks outlined in the first article on Data Mesh. This component of the Data Mesh architecture is designed specifically to reduce the cognitive load on the teams in individual Domains. It gives them a set of appropriately parametrized and locally integrated capabilities that make Data Product management more agile and less complicated.
The Self-Serve Data Platform capabilities can be logically organized on three interconnected planes:
- The entire Data Mesh experience plane provides the generalized features to search for Data Products, manage the Data Lineage, monitor the overall activities in the organization, and support compliance with global policies;
- The single Data Product experience plane provides the services for managing the life cycle of a Data Product (the Data Product journey): implementation and publication, data quality, documentation and subscription, querying, and so on through to disposal;
- The data infrastructure plane provides the data management primitives that the other planes use: query engine, persistence management, rule engine, identity management, etc.
These three planes communicate with each other and with the outside world through APIs. A unified data platform that gives the Domains generalized capabilities for producing and consuming Data Products, and that limits the complexity they face to the parameters determining how those capabilities behave, puts the declarative paradigm into practice. This approach to data management is far more efficient than the imperative one, in which the tool (or the programming language) requires the developer to spell out every technical detail of the operations to run on the data.
But what is the governance model of this framework? And, in particular, who manages its components (the Domains, the Data Products, the three planes of the Self-Serve Data Platform)?
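Before turning to those questions, the declarative idea can be made concrete with a minimal sketch. It assumes a hypothetical descriptor (`DataProductSpec`) in which a Domain team states only the parameters of its Data Product, and a hypothetical platform-side function (`plan_deployment`) that derives the technical steps. None of these names come from a real platform; they only illustrate the division of labor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProductSpec:
    """Hypothetical declarative descriptor: says WHAT is wanted, not HOW."""
    name: str
    domain: str
    source_table: str
    output_format: str = "parquet"
    retention_days: int = 365
    quality_checks: tuple = ()


def plan_deployment(spec: DataProductSpec) -> list:
    """Hypothetical platform logic: turn a spec into ordered technical steps.

    In an imperative setting the Domain team would have to write
    each of these steps itself.
    """
    steps = [
        f"provision storage for {spec.domain}/{spec.name} ({spec.output_format})",
        f"schedule ingestion from {spec.source_table}",
    ]
    steps += [f"register quality check: {check}" for check in spec.quality_checks]
    steps.append(f"set retention to {spec.retention_days} days")
    steps.append(f"publish {spec.name} to the catalog")
    return steps


# The Domain team only fills in parameters (all values are illustrative).
spec = DataProductSpec(
    name="orders",
    domain="sales",
    source_table="erp.orders",
    quality_checks=("not_null:order_id",),
)
for step in plan_deployment(spec):
    print(step)
```

The Domain team's entire contribution is the `spec` object; everything inside `plan_deployment` stands for what the platform would take care of on its behalf.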
The Principle of Federated Computational Governance
The fourth and last principle is Federated Computational Governance. It describes the system of responsibilities that regulates the functioning of the Data Mesh: the processes and logical components involved, and the individual roles with their competencies. A federated governance model supports one of the key purposes of the framework: it gives the Data Domains the highest possible degree of autonomy and responsibility. Responsibilities are distributed across the following levels and roles:
- The central level is limited to supervising the policies that all Domains must respect (for example, compliance with regulatory requirements, the taxonomy of data quality criteria, data protection and security standards) and to managing the Self-Serve Data Platform.
- At the level of individual Data Domains, interdisciplinary Data Domain Teams are formed from the company employees with the most in-depth business and IT skills regarding the data the Domain manages. Each Team has autonomy and responsibility over the operational data management within its competence, over the production and management of its Data Products, and over the consumption and use of the Data Products it has subscribed to and can access. It has to comply only with the global policies and rules, particularly those concerning the use of Self-Serve Data Platform capabilities. Within this constraint, the Data Domain Team enjoys full autonomy in defining its internal policies and the characteristics of the Data Products it manages.
- In particular, Data Product Owners are appointed within the Domains. The task of a Data Product Owner is to manage one or more Data Products implemented within the Domain along their entire life cycle. They are also responsible for seeing that the Data Products meet the global policies and, in general, the eight minimum criteria: Discoverable, Addressable, Understandable, Trustworthy, Accessible, Interoperable, of Value, and Secure. The physical implementation of a Data Product is entrusted to a Data Product Developer, also within the Domain.
- There are also Data Product Consumers. These are the profiles who use the Data Products to create reports, analyses, and dashboards (data analysts); train and execute AI and ML models (data scientists); implement other Data Products (Data Product Developers); or implement applications and services for the operational management of the Domain (application developers).
- The Data Platform Team that manages the Self-Serve Data Platform consists of two main figures: a Data Platform Developer, who develops and manages the services provided by the three planes of the platform, and a Data Platform Product Owner. The latter, similarly to a Data Product Owner, manages the services the platform provides to the domains as if they were products, ensuring they are correct and comply with the declarative perspective along their entire life cycle.
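The "computational" part of this governance model implies that global policies can be checked automatically rather than by manual review. The following is a minimal sketch of that idea, assuming a hypothetical `DataProduct` descriptor and a hypothetical set of rules covering a few of the minimum criteria listed above. The field names, the `mesh://` address scheme, and the rules themselves are illustrative assumptions, not part of any standard.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical metadata a Data Product Owner publishes to the platform."""
    name: str
    domain: str
    owner: str = ""
    address: str = ""
    description: str = ""
    schema: dict = field(default_factory=dict)
    access_roles: list = field(default_factory=list)
    encrypted: bool = False


# Global policies defined centrally, expressed as code (illustrative rules only).
GLOBAL_POLICIES = {
    "discoverable": lambda p: bool(p.name and p.owner),
    "addressable": lambda p: bool(p.address),
    "understandable": lambda p: bool(p.description and p.schema),
    "secure": lambda p: p.encrypted and bool(p.access_roles),
}


def violated_policies(product, policies=GLOBAL_POLICIES):
    """Return the names of the policies the product fails, in definition order."""
    return [name for name, check in policies.items() if not check(product)]


# A product that is registered and addressable, but undocumented and unprotected.
draft = DataProduct(
    name="orders",
    domain="sales",
    owner="alice",
    address="mesh://sales/orders",
)
print(violated_policies(draft))  # ['understandable', 'secure']
```

In this sketch the central level would own only `GLOBAL_POLICIES`, while each Domain remains free to decide everything about its Data Products that the checks do not constrain.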
The organizational impact
In a functioning Data Mesh architecture, there are no CDO (Chief Data Officer) or CDAO (Chief Data & Analytics Officer) figures. A Data Mesh is centrally coordinated by a Federated Governance Team, which consists of representatives of Data Domain Teams, Data Product Owners, Data Platform Team, and subject matter experts on security, protection, compliance, and legal affairs. This federated team is responsible for defining global policies to be applied within all Domains and, if necessary, implemented in Data Platform Services.
Is it possible?
From this quick overview of the four principles of Data Mesh, it is easy to see the revolutionary character of this framework compared to models based only on the central management of analytical data. It brings significant changes to the organizational model and to the use of data management technologies. To the best of our knowledge, and at least in Italy, there are still few cases of practical application, and most of them do not cover an entire organization. The framework can only be adopted gradually, in successive steps.
A first preparatory step could be to isolate a self-contained Data Domain and use it to pilot a project in which an interdisciplinary team implements one or more deliverables in the form of Data Products. Based on the result of this experience, it will be possible to assess the feasibility of applying the Data Mesh on a larger scale.
However, it is important to emphasize that every business can find inspiring ideas in the four principles of this framework. We name but a few:
- a different balance of responsibilities for data handling, favoring “local” roles for the data of a given area, supported by a data literacy program and the promotion of a data culture;
- a cross-functional (business and IT) approach to data asset management as the key to a robust data governance model;
- attention to the accompanying metadata to ensure potential users (data consumers) have a correct understanding of the dataset characteristics (whether it is a report, a data stream, or an actual Data Product);
- a preference for the declarative approach when choosing or implementing data management tools, so that capabilities become less technically demanding and the user experience stays efficient and sustainable regardless of the user's level of competence.