Data Quality: What It Is, Why It Matters, and How to Implement It in Your Business

Is your latest Generative AI initiative not delivering the expected results? Do your business reports require days of manual reconciliation? The problem probably isn’t algorithms or people, but the foundation: the quality of your data. We don’t build houses on cracked foundations, so why should we build our business on unreliable data?

According to Gartner, back in 2020 (well before the current “rush” to implement AI solutions) poor data quality was already costing organizations an average of about $11 million per year. That estimate later rose to $12.9 million (+19%), with an even more troubling finding: 60% of companies surveyed in 2024 didn’t know what the actual economic impact of this situation was for them.

But the real damage, today more than ever, is strategic: slower decisions, eroded trust, and above all the inability to seize the opportunities offered by Artificial Intelligence. Research on AI and Data Quality conducted for Irion by Politecnico di Milano (Big Data & Business Analytics Observatory) found that, in the Italian context, 3 out of 4 companies risk being caught unprepared.

If five years ago talking about data quality meant laying the groundwork for governing an organization’s information assets, today it means building a “launch pad” for generative AI, Data Observability, and the creation of reliable Data Products. Data Quality is increasingly moving out of its purely technical role to become the lifeblood at the heart of the business.

Furthermore, Article 10 “Data and data governance” of the AI Act — the legal framework adopted by the European Union on Artificial Intelligence — explicitly sets out obligations regarding data quality, governance, and traceability for “high-risk” AI systems. Examples include: banking and insurance scoring for credit access or premium calculation, personnel selection, infrastructure management, biometric systems, automated assessments, support for judicial decisions, predictive policing, and border control.

According to the Global Data Management Community (DAMA), data quality “consists of the planning, implementation, and monitoring of activities that apply data quality management techniques to ensure that data is fit for purpose and meets user needs.” Speaking of DAMA International, the global kickoff for the work to update the DMBOK® (Data Management Body of Knowledge) to its third version took place in the summer of 2025.

What Is Data Quality: definition and key dimensions

When we talk about data quality, we are referring to the set of activities, rules, and controls that enable an organization to have accurate, reliable, and usable data for business processes, reporting, and feeding artificial intelligence models. To effectively assess the quality of data, the industry relies on six standard dimensions: completeness, uniqueness, timeliness, validity, accuracy, and consistency.

Why should a company implement a Data Quality system?

Gartner noted as early as five years ago (Gartner: “5 Steps to Build a Business Case for Continuous Data Quality Assurance,” April 20, 2020, Saul Judah, Alan D. Duncan, Melody Chien, Ted Friedman) that estimates of economic losses due to “poor data” were set to rise “as business environments become increasingly digitized and complex.”

Information is the foundation of every business process, and the quality of the data that is collected, stored, and used inevitably shapes the business of the organization today and tomorrow. “Poor data quality destroys business value,” emphasizes Gartner, because data provides the information that forms the basis of knowledge and generates business insights, which in turn lead to competitive advantages and ensure a strong market position. We can therefore compare data to the foundation of a house: only if it is solid can we expect the house to withstand even earthquakes.

Birth dates entered as 2190, sequences of identical digits used as VAT numbers, addresses consisting solely of a street name: these are just a few of the anomalies found in company databases. An incorrect address may mean a missed opportunity to contact a customer or prospect, and therefore a loss, but the consequences are far more serious when incorrect data is used to determine a risk profile.
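The three anomalies mentioned above can each be caught by a very small check. The following is a minimal sketch, not a production validator: the function names, the address field names, and the lower bound of 1900 for birth years are all assumptions made for illustration.

```python
import re
from datetime import date
from typing import Optional


def birth_year_is_plausible(year: int, today: Optional[date] = None) -> bool:
    """Reject birth years in the future (e.g. 2190) or implausibly far back."""
    current_year = (today or date.today()).year
    return 1900 <= year <= current_year  # 1900 is an assumed lower bound


def vat_number_is_suspicious(vat: str) -> bool:
    """Flag VAT numbers made of a single repeated digit, e.g. '11111111111'."""
    digits = re.sub(r"\D", "", vat)
    return len(digits) > 0 and len(set(digits)) == 1


def address_is_incomplete(address: dict) -> bool:
    """Flag addresses where any field beyond the street name is missing."""
    required = ("street", "house_number", "postal_code", "city")  # assumed fields
    return any(not str(address.get(field) or "").strip() for field in required)
```

Checks like these only detect the anomaly; deciding what to do with a flagged record (block, correct, or route to remediation) is a separate policy decision.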

Even more dangerous is providing management with reports containing inaccurate data, which can lead to “skewed” strategic decisions and negatively impact the organization’s financial performance. This also fosters a deep mistrust of data among employees, undermining both the credibility of the data and its use.

To remain competitive, it is essential to establish a data quality assurance system that delivers information reliable enough for its intended business use within process deadlines, and that supports well-designed diagnostics and structural corrections for any anomalies identified.

What are the methods of application and the main control criteria?

Implementing a data quality system is a long-term endeavor. Going into detail about each phase would require writing a book, but in summary, we can outline a few key steps:

  • establish a company policy that sets out the “rules of the game” for all parties involved;
  • identify a pilot scope, map the data present at each stage of the process within it, and apply appropriate transformation and validation steps through a rule-based system, including rules expressed in natural language. Rules may be technical (e.g., verifying compliance with a data format), business (e.g., a paid-off loan has a zero balance), or reconciliation rules (applied after appropriately normalizing the data to be compared);
  • keep the system fully operational and monitor data quality trends through a set of supporting indicators;
  • take any necessary steps to address the identified issues and make structural improvements;
  • expand to new applications.
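To make the three rule types in the second step more concrete, here is a minimal sketch of a rule-based check in Python. It is an illustration under stated assumptions, not a description of any specific product: the `Rule` class, the record fields (`iban`, `status`, `balance`, `ledger_balance`), and the simplified IBAN shape check (no checksum) are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    name: str
    kind: str                      # "technical", "business" or "reconciliation"
    description: str               # the rule as stated in natural language
    check: Callable[[dict], bool]  # returns True when the record complies


# Shape check only (country code, check digits, alphanumeric body); no checksum.
IBAN_SHAPE = re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$")


def normalize_amount(value) -> float:
    """Normalize amounts such as '150,00' or 150.0 to a comparable float."""
    return round(float(str(value).replace(",", ".")), 2)


RULES = [
    Rule("iban_format", "technical",
         "The IBAN conforms to the expected format",
         lambda r: bool(IBAN_SHAPE.match(r["iban"]))),
    Rule("paid_off_zero_balance", "business",
         "A paid-off loan has a zero balance",
         lambda r: r["status"] != "PAID_OFF" or normalize_amount(r["balance"]) == 0),
    Rule("ledger_reconciliation", "reconciliation",
         "The loan balance matches the ledger balance after normalization",
         lambda r: normalize_amount(r["balance"]) == normalize_amount(r["ledger_balance"])),
]


def evaluate(record: dict, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of the rules the record violates."""
    return [rule.name for rule in rules if not rule.check(record)]
```

Keeping the natural-language description next to the executable check is one simple way to let business users review what each rule is meant to enforce.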

The most time-consuming part is likely the design of control mechanisms that verify data compliance with a set of criteria, generate results, and enable the detection of anomalous data. In 2013, DAMA UK (DAMA-DMBOK Chapter 13) identified six dimensions around which technical and business controls should be aligned:

  • Completeness: the percentage of data stored relative to the full potential of 100%;
  • Uniqueness: no instance (object) of the entity is recorded more than once, based on how that object is identified;
  • Timeliness: the extent to which the data accurately reflects reality at the time it is needed;
  • Validity: data is valid if it conforms to the syntax (format, type, range) specified in its definition;
  • Accuracy: the extent to which the data correctly describes the “real-world” object or event being described;
  • Consistency: the absence of any discrepancy when comparing two or more representations of a “thing” with a definition.
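Three of these dimensions (completeness, uniqueness, and validity) translate directly into simple ratios. The sketch below shows one possible way to compute them over a list of values; the function names and the choice to treat blank strings as missing are assumptions for illustration, and accuracy, timeliness, and consistency are left out because they require an external reference to compare against.

```python
import re


def _populated(values):
    """Keep only values that are neither None nor blank."""
    return [v for v in values if v is not None and str(v).strip() != ""]


def completeness(values) -> float:
    """Share of values that are actually populated."""
    values = list(values)
    return len(_populated(values)) / len(values) if values else 1.0


def uniqueness(keys) -> float:
    """Share of distinct identifiers among all recorded identifiers."""
    keys = list(keys)
    return len(set(keys)) / len(keys) if keys else 1.0


def validity(values, pattern: str) -> float:
    """Share of populated values matching the syntax given in the definition."""
    populated = [str(v) for v in _populated(values)]
    if not populated:
        return 1.0
    return sum(1 for v in populated if re.fullmatch(pattern, v)) / len(populated)


customers = [
    {"id": 1, "email": "anna@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "not-an-email"},
    {"id": 3, "email": "marco@example.com"},
]
emails = [c["email"] for c in customers]
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"  # deliberately loose, for illustration

print(completeness(emails))                    # 3 of 4 emails are populated
print(uniqueness([c["id"] for c in customers]))  # 3 distinct ids out of 4
print(validity(emails, EMAIL_PATTERN))         # 2 of 3 populated emails are valid
```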

Data Quality vs. Data Observability: what’s the difference?

The criteria listed above obviously represent only a selection from a broader set of data quality criteria documented in the literature. While the six dimensions tell us “what” the state of data health is at a given moment, Data Observability helps us understand “why” and prevent future problems. It is a holistic approach that constantly monitors data health throughout all pipelines, not just in the final database. As Gartner puts it, it is “an organization’s ability to have broad visibility into its data landscape and multi-level dependencies.”

“Monitoring” data pipelines means being able to track not only static quality but also data freshness, volume, schema, and lineage, essentially in real time. This makes it possible to detect anomalies (such as a sudden drop in the number of records in a data stream) before they impact business processes.

In other words: we could think of traditional data quality as an annual medical checkup, providing a snapshot of your health; whereas data observability is more like a fitness tracker, a smartwatch that monitors your vital signs around the clock, alerting you to a potential anomaly (such as an irregular heartbeat) as soon as it occurs and allowing you to take action before it becomes a more serious problem.
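As an illustration of the “fitness tracker” idea, the sketch below checks two of the signals mentioned above, volume and freshness, on every pipeline run. It is a simplified example under stated assumptions: the function names, the three-standard-deviation threshold, and the maximum age are arbitrary choices, and real observability tools typically use more robust baselines.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def volume_anomaly(history, latest, threshold=3.0) -> bool:
    """Flag `latest` when it deviates from the recent record counts by more
    than `threshold` sample standard deviations."""
    if len(history) < 2:
        return False  # not enough history to build a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


def is_stale(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Flag a dataset whose last successful load is older than `max_age`."""
    return now - last_loaded_at > max_age


daily_counts = [10_120, 9_980, 10_050, 10_210, 9_940]
print(volume_anomaly(daily_counts, 4_300))   # sudden drop in records
print(volume_anomaly(daily_counts, 10_100))  # within the usual range
```

Running such checks at each stage of a pipeline, rather than only on the final database, is what allows a drop in volume to be caught before it reaches downstream reports.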

How should Data Quality metrics and indicators be structured?

The proper operation and performance improvement of the Data Quality system depend on the availability of a set of metrics: you can’t improve what you don’t measure. A metrics system must reflect the primary information needs:

  • It is advisable to identify a few key metrics and focus reporting efforts on them. While it is true that “you can’t manage what you can’t measure,” it is also true that “measuring costs money.”
  • In general, metrics should be integrated into the system as much as possible, that is, they should be cohesive and linked to one another through a logical framework. It is always advisable to keep the terminology and definitions of metrics consistent.
  • The system must be balanced, that is, it must include various types and perspectives, weighted according to their representativeness.
  • It is a good idea to present the metrics grouped by similar categories or types.
  • The purpose of metrics in a data quality system is not to measure individual productivity or quality, or to encourage competition among individuals or departments, but to measure the quality of the product (the data) and the processes. For example: instead of measuring the number of validated data flows per day, it is better to measure the number of error-free data flows. Measuring individual performance can be tempting, but it is certainly one of the most harmful things for a quality initiative. The only acceptable performance measurement, if any, is at the workgroup level.
  • A metric should always be empirically validated in a variety of contexts before being published.
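The example of error-free data flows, aggregated at the workgroup level rather than per individual, could be computed along these lines. This is only a sketch: the function name and the `(workgroup, error_count)` input shape are assumptions made for illustration.

```python
from collections import defaultdict


def error_free_rate_by_workgroup(flows):
    """Compute the share of error-free data flows per workgroup.

    `flows` is an iterable of (workgroup, error_count) pairs, one per data flow.
    """
    totals = defaultdict(int)
    clean = defaultdict(int)
    for workgroup, error_count in flows:
        totals[workgroup] += 1
        if error_count == 0:
            clean[workgroup] += 1
    return {wg: clean[wg] / totals[wg] for wg in totals}


flows = [("finance", 0), ("finance", 2), ("risk", 0), ("risk", 0)]
print(error_free_rate_by_workgroup(flows))
```

The metric describes the product (how many flows arrive clean), not how many flows a given person processed.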

Why does Irion EDM enable the creation of an effective Data Quality framework?

As the number of information areas to monitor grows, along with the number of checks to define, manage, run periodically, and measure, a quality system finds indispensable support in technological tools that automate the most demanding tasks, such as performing periodic checks, calculating quality metrics, and generating reports.

Irion has completed hundreds of projects in this area, developing a platform that reduces setup time and speeds up the implementation of control procedures while fully complying with company policies. Some examples:

  • powerful control engines that perform 2.5 million checks per minute, verifying over 60 million records;
  • an effective system for managing data remediation and issues related to poor data quality;
  • a module that allows you to adopt metrics that have already been tested by various organizations or to define, calculate, and analyze any type of indicator for any type of business process;
  • automated processes that intelligently generate technical rules from metadata in just a few seconds.
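To give a general idea of what generating technical rules from metadata can mean, the sketch below derives simple checks from a column description. It is a generic illustration and does not represent how Irion EDM works internally: the metadata keys (`name`, `type`, `nullable`, `max_length`) and the function names are hypothetical.

```python
def rules_from_metadata(columns):
    """Derive simple technical checks from column metadata.

    Returns a dict mapping a rule name to a function that takes a record
    (a dict) and returns True when the record complies.
    """
    rules = {}
    for col in columns:
        name = col["name"]
        if not col.get("nullable", True):
            rules[f"{name}_not_null"] = lambda rec, n=name: rec.get(n) is not None
        if col.get("type") == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly
            rules[f"{name}_is_integer"] = lambda rec, n=name: rec.get(n) is None or (
                isinstance(rec[n], int) and not isinstance(rec[n], bool))
        if "max_length" in col:
            rules[f"{name}_max_length"] = (
                lambda rec, n=name, m=col["max_length"]:
                rec.get(n) is None or len(str(rec[n])) <= m)
    return rules


def violations(record, rules):
    """Return the sorted names of the rules the record violates."""
    return sorted(name for name, check in rules.items() if not check(record))


metadata = [
    {"name": "customer_id", "type": "integer", "nullable": False},
    {"name": "country", "type": "string", "max_length": 2},
]
rules = rules_from_metadata(metadata)
print(violations({"customer_id": None, "country": "ITA"}, rules))
```

Because the rules follow mechanically from the schema, they can be regenerated whenever the metadata changes, leaving manual effort for business and reconciliation rules.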

Irion EDM is currently used by 8 of Italy’s 10 largest banking groups, half of the country’s leading insurance companies, and major enterprises and complex organizations across sectors such as energy, utilities, transportation, logistics, and public administration. Here are a few examples within the scope of advanced Data Quality solutions:

  • Major financial institutions use our platform on a daily basis to verify the consistency and formal accuracy of sensitive tables in massive databases containing millions of transactions, ensuring compliance with mandatory reporting requirements and minimizing the risk of penalties
  • An Italian banking leader uses an Irion-based “Accounting Reclassification Engine” to automatically classify 400 million daily transactions, ensuring consistency between accounting records, financial statements, and regulatory filings, while complying with the strict requirements of Bank of Italy Circular 285
  • For a major financial institution, Irion manages the process of automatically deleting personal data after a certain period of time (the “right to be forgotten”), working across various critical databases in the finance and treasury sectors and ensuring full compliance with the GDPR

Irion EDM and the modern approach to quality control

Irion is the only Italian company included in Gartner®’s 2025 Magic Quadrant™ for “Augmented Data Quality Solutions.” A modern data quality framework requires a powerful and flexible engine like Irion EDM, the platform that enables this strategic vision by allowing you to:

  • Automate and scale: perform millions of checks per minute on tens of millions of records
  • Automatically generate rules: leverage metadata to create automated validation rules, accelerating deployment
  • Engage the business: provide intuitive interfaces for rule management, remediation, and data justification

Ignoring data quality today means building your digital future, one based on AI, data-driven decisions, and automation, on a fragile and unstable foundation. It means feeding your algorithms “junk data,” resulting in unreliable outcomes. It’s not enough to patch up the cracks: we need to repair the very foundations of the house, that is, of the business. Moving beyond the traditional dimensions of data quality to embrace the new paradigm of observability and the power of automation is a crucial first step toward transforming data from a hidden risk into a core asset for the company.

FAQ

What is Data Quality?

It is the discipline that measures, monitors, and ensures the reliability of business data, guaranteeing that it is complete, unique, accurate, consistent, timely, and valid for its intended purposes.

What are the dimensions of data quality?

The six reference dimensions, as defined by DAMA UK, are completeness, uniqueness, timeliness, validity, accuracy, and consistency.

Why is data quality important for artificial intelligence?

Because an AI model is only as reliable as the data it is trained on and used with: incomplete, duplicate, or invalid data leads to unreliable results, regardless of the algorithm’s quality. Furthermore, the AI Act imposes specific data quality and governance requirements for high-risk AI systems.

What is the difference between Data Quality and Data Observability?

Data Quality measures the health of data at a given point in time, much like a snapshot. Data Observability continuously monitors the health of data pipelines in real time (freshness, volume, schema, lineage), enabling anomalies to be detected before they cause problems.

Where do you start when building a data quality system?

The process begins by establishing a shared company policy. It continues by identifying a pilot area where control rules will be applied, keeping the system running smoothly with the help of monitoring indicators, addressing the issues that emerge, and gradually expanding the framework to new areas of the company.

Related links

DAMA – The Global Data Management Community
