The term metadata combines the prefix meta (from the Greek preposition μετά, “beyond” or “after”) with the Latin data, the plural of datum, “something given”.
But what exactly is it?
The Italian Accademia della Crusca defines metadata as markers, a kind of Post-it note, attached to an object (image, document, web page, etc.) or to a set of objects in order to describe its contents and/or attributes.
On the other hand, Gartner Glossary immediately highlights the importance of what the term represents:
Metadata is information that describes various facets of an information asset to improve its usability throughout its life cycle. It is metadata that turns information into an asset. Generally speaking, the more valuable the information asset, the more critical it is to manage the metadata about it, because it is the metadata definition that provides the understanding that unlocks the value of data.
In the literature, there is a distinction between structural metadata, which defines data architecture and interrelation, and content metadata, which classifies and describes the data.
Furthermore, according to the most classical taxonomy, we can classify metadata into three distinct, but interconnected types:
- business metadata (collected in a business glossary, e.g., business terms, ownership semantics, related processes, rules)
- technical metadata (entered in a metadata dictionary, e.g., physical fields, lengths and formats, applications, automatic controls)
- operational metadata (e.g., input streams, completion of transformation processes, outcomes of controls in a given period)
These metadata categories are linked to each other through relations, such as vertical Lineage, which maps business terms to the fields that represent them in computer systems. The interconnection of these three areas of governance is essential for data asset management.
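As a rough illustration of how the three categories and vertical lineage fit together, here is a minimal Python sketch. All class names, field names, and sample values (BusinessTerm, TechnicalField, ControlOutcome, MetadataRepository, the "crm" and "dwh" systems) are hypothetical and chosen only for this example; they do not describe any particular product's metadata model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BusinessTerm:
    """Business metadata: one entry of a business glossary."""
    name: str
    owner: str
    definition: str


@dataclass(frozen=True)
class TechnicalField:
    """Technical metadata: one physical field in a system."""
    system: str
    table: str
    column: str
    data_type: str
    length: int

    @property
    def key(self) -> str:
        return f"{self.system}.{self.table}.{self.column}"


@dataclass(frozen=True)
class ControlOutcome:
    """Operational metadata: the result of one control on one field."""
    field_key: str
    control: str
    passed: bool


@dataclass
class MetadataRepository:
    terms: dict = field(default_factory=dict)
    fields: dict = field(default_factory=dict)
    lineage: dict = field(default_factory=dict)  # term name -> field keys
    outcomes: list = field(default_factory=list)

    def link(self, term: BusinessTerm, tech: TechnicalField) -> None:
        """Record a vertical lineage relation: business term -> physical field."""
        self.terms[term.name] = term
        self.fields[tech.key] = tech
        self.lineage.setdefault(term.name, []).append(tech.key)

    def failed_controls(self, term_name: str) -> list:
        """Follow lineage from a business term to the failed controls on its fields."""
        keys = set(self.lineage.get(term_name, []))
        return [o for o in self.outcomes if o.field_key in keys and not o.passed]


repo = MetadataRepository()
term = BusinessTerm("Customer e-mail", "Marketing", "Primary contact address")
repo.link(term, TechnicalField("crm", "customers", "email", "varchar", 254))
repo.link(term, TechnicalField("dwh", "dim_customer", "email_addr", "varchar", 254))
repo.outcomes.append(ControlOutcome("crm.customers.email", "not_null", True))
repo.outcomes.append(ControlOutcome("dwh.dim_customer.email_addr", "valid_format", False))

failed = repo.failed_controls("Customer e-mail")
print([f"{o.field_key}: {o.control}" for o in failed])
```

The point of the sketch is the lineage dictionary: because it connects a glossary term to its physical fields, a question asked in business language ("are there problems with customer e-mail?") can be answered from operational metadata.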
What is it for? Why is it convenient to use it?
Metadata contributes to the processes of data storage, data protection and control, data processing, data integration, and data governance. In short, it helps a business understand its data, systems, and workflows.
To get a better idea of the importance of metadata in data management, imagine a huge warehouse holding hundreds of thousands of goods but no inventory. An inventory records not only the characteristics of each product but also its location, so without it workers would take a very long time to find a specific item. New employees might not even know how or where to start the search. The inventory does not just provide the necessary information (which goods are in stock and where they are stored). It also makes it possible to find what is needed from different starting points (type, name, size, availability). An organization without metadata is like a warehouse without an inventory. The amount of data in a company is very large and constantly growing. Without metadata, it is impossible to manage that data as a resource or, more precisely, impossible to manage it efficiently and effectively.
Metadata regularly finds use in Data Management disciplines, for example:
- in data privacy, to guarantee quick identification of private or sensitive data
- in Data Quality, to swiftly detect redundant and low-quality data types or to identify the most appropriate controls
- in Data Governance, to classify who can see certain data (by leveraging metadata attributes related to confidentiality, ownership and permissions management) or to construct Data Lineage
- in data discovery, to find data with certain characteristics
- in data orchestration and in many emerging disciplines, such as Data Valuation, Adaptive Data Governance, the design of a Data Fabric architecture, or DataOps, which uses metadata to improve data value and usability in a dynamic environment
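The discovery and privacy uses in the list above can be sketched in a few lines of Python. This is only an illustrative example: the catalog entries, the attribute names (name, type, owner, tags), and the discover function are assumptions made up for the sketch, not part of any specific tool.

```python
# A tiny, hypothetical metadata catalog: each entry describes one data field.
catalog = [
    {"name": "customers.email", "type": "varchar", "owner": "Marketing", "tags": {"pii", "contact"}},
    {"name": "customers.birth_date", "type": "date", "owner": "Marketing", "tags": {"pii"}},
    {"name": "orders.total", "type": "decimal", "owner": "Sales", "tags": {"finance"}},
]


def discover(entries, **criteria):
    """Return the names of entries that match every criterion.

    The special criterion 'tag' checks membership in the entry's tag set;
    any other criterion is compared for equality with the entry's attribute.
    """
    def matches(entry):
        for key, wanted in criteria.items():
            if key == "tag":
                if wanted not in entry["tags"]:
                    return False
            elif entry.get(key) != wanted:
                return False
        return True

    return [e["name"] for e in entries if matches(e)]


# Data privacy: quickly identify fields tagged as personal data.
print(discover(catalog, tag="pii"))
# Data discovery: start the search from a different attribute.
print(discover(catalog, owner="Sales"))
print(discover(catalog, type="date", tag="pii"))
```

As with the warehouse inventory, the same catalog can be searched from different starting points (tag, owner, type) without touching the underlying data.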
What is Metadata Management? What is active metadata?
And how should metadata be managed? Metadata Management is the discipline that sets out the most appropriate ways to make the most of metadata's potential.
However, a new concept has recently become established in this field. Data Management platforms can now transform metadata, which traditionally was only collected and therefore passive, into active metadata: metadata that can automatically trigger certain functions, reducing the effort required from Data Specialists.
In short, active metadata is metadata whose analysis reveals opportunities to process and use data assets more easily and efficiently. Examples include log files, transactions, user logins, and query optimization plans.
For example, based on the features of a metadata element, it is possible to:
- suggest to the Data Steward possible Data Quality rules
- recommend data to categorize as sensitive for protection of privacy
- drive data pipeline executions
- use the “user logins” metadata to automatically notify groups of users when new data assets similar to those already visible to them become available
- and much more…
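To make the first two items of the list above more concrete, here is a minimal Python sketch of metadata driving suggestions automatically. Everything in it is an assumption for illustration: the metadata keys (name, data_type, length, nullable), the name pattern used to spot possibly sensitive fields, and the suggest_actions function are hypothetical and far simpler than what a real platform would use.

```python
import re

# Hypothetical heuristic: field names that hint at personal data.
SENSITIVE_NAME = re.compile(r"(email|phone|birth|ssn|iban)", re.IGNORECASE)


def suggest_actions(meta):
    """Derive suggestions for a Data Steward from one field's metadata."""
    suggestions = []
    # Technical metadata says the field is mandatory: propose a completeness rule.
    if not meta.get("nullable", True):
        suggestions.append("rule: value must not be null")
    # A declared length on a text field: propose a length check.
    if meta.get("data_type") == "varchar" and "length" in meta:
        suggestions.append(f"rule: length must be <= {meta['length']}")
    # The field name looks like personal data: recommend a privacy classification.
    if SENSITIVE_NAME.search(meta["name"]):
        suggestions.append("classify: candidate sensitive data")
    return suggestions


field_meta = {"name": "customer_email", "data_type": "varchar", "length": 254, "nullable": False}
for suggestion in suggest_actions(field_meta):
    print(suggestion)
```

The suggestions are only proposals: in this sketch the Data Steward still decides whether to accept a rule or a classification, which matches the idea of reducing, not removing, the specialist's involvement.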
With Irion EDM, you can easily manage your business metadata and generate value from it:
- you can activate your metadata and save your time for more valuable tasks
- you can drive the Enterprise Information Management system functions (Preparation, Quality, Transformation, Masking, Discovery, etc.) thanks to active metadata functionality
- you can define a free and entirely configurable metadata model. Because you do not have to settle on a definitive structure immediately or have metamodels for your Business Glossary already available, you can implement it faster
- you can prepare and deliver information services based on available metadata to answer recurring questions (from the management, business or IT analysts, data scientists, DPO) on the company’s data assets or to comply with regulatory requirements
- you can quickly respond to unexpected requests both from colleagues and from internal and external inspection bodies (impact analysis, data quality, etc.)
- you can access a variety of sources thanks to Irion EDM connectors
- you can manage all technical and business data related to the company’s data assets in a Data Catalog, along with the related interconnections, without the performance problems that large volumes of metadata may cause
- you can automate Data Governance reporting (Business Glossary, Enterprise Data Catalog) and more (control system documentation, processing log, etc.)
- you can organize the collaboration of several teams that work (even concurrently) on the model structure and content, using an exclusive Enterprise change management system