Data Product: a solution to the non-deployment of machine learning projects

Of the many machine learning projects launched, very few are actually implemented in production environments. This trend, which emerged from a survey by the specialised magazine KDnuggets, highlights the difficulties encountered by companies that do not adopt the right approach when trying to create applications based on artificial intelligence or advanced analytics. Commenting on the survey, Eric Siegel (founder of Machine Learning Week) speaks of ‘industry-wide failure’, which he attributes to the absence of mature leadership.

Why ML projects fail

According to the expert, the strong interest (‘excitement’) in ML projects is fully justified; the problems lie almost entirely in the corporate approach, which is often more focused on the technology itself than on putting it into production. The survey drew responses from 114 data scientists: half belonged to companies that consume analytics, while the other half were split between vendors, public companies, academia and other stakeholders.

The first question asked what share of machine learning models had reached the deployment phase: 4 out of 5 respondents indicated less than 40 per cent, and more than half (58 per cent of the answers) put the figure under 20 per cent.

These results, Siegel explains, are in line with previous research. The second question investigated the causes of the failure to bring ML projects into production. In two out of three cases, the critical issues concern the relationship between the new models and existing operations: either decision makers do not approve the change (32 per cent), or the project gets bogged down in technical difficulties (35 per cent) when implementing or integrating the new model into existing processes.

Supporting the work of data scientists

The data scientist, as a rule, does not have an easy time. Often the data they need to create any model (machine learning included) arrives late, perhaps in a different format than expected, or contains countless errors, some of them difficult to detect. Worse still, sometimes nobody is sure where the data is.

According to Gartner, poor data quality costs an average company $12.9 million each year.

That is why a data management expert can keep a project from getting off to a poor start, for instance through a lack of attention to data preservation and governance. The preparatory phase is critical to the success of any initiative, which makes the contribution of professionals such as the data engineer essential.
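The problems described above (late delivery, unexpected formats, hidden errors) are the kind that simple automated checks can surface before modelling begins. The following is a minimal sketch in Python, not a reference implementation: the schema, the field names and the helper functions `check_record` and `is_late` are all hypothetical and chosen purely for illustration.

```python
from datetime import date

# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_SCHEMA = {
    "customer_id": int,
    "signup_date": date,
    "monthly_spend": float,
}


def check_record(record: dict, schema: dict = EXPECTED_SCHEMA) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, expected_type in schema.items():
        value = record.get(field)
        if value is None:
            # Field is absent or explicitly empty.
            problems.append(f"{field}: missing")
        elif not isinstance(value, expected_type):
            # Field is present but in a different format than expected.
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


def is_late(delivered_on: date, due_on: date) -> bool:
    """True when a dataset was delivered after its agreed due date."""
    return delivered_on > due_on
```

A record that passes returns an empty list, while a record with a missing or wrongly typed field returns one message per problem, which a data engineer could log or use to block the hand-off to the data science team.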

Data Product and Leadership

According to Harvard Business Review, data products (defined as “the attempt to create reusable datasets, which over time can be analyzed in different ways by different users, to solve a given business problem”) are a powerful answer – especially in “large legacy companies” – to this type of critical issue. This holds whether or not they integrate AI and/or analytics capabilities, because their “product orientation” creates benefits either way.

The idea behind data products, HBR points out, is not entirely new. But its adoption can be decisive in those kinds of companies, because many of their data scientists “believe their work is done when they create a model that fits the data well.” This is why a new role is being proposed: the data product manager, whose skills differ not only from those of data scientists but also from those of chief data officers (CDOs).

Like a classic product manager, this professional must be able to manage the cross-functional development of a data product, its deployment, a multidisciplinary team and the related tasks. They also need the ability to communicate effectively with business leaders, whose processes will be affected by all these changes.

Irion’s Vision

The phrase “data product” is now a central concept in modern data management, thanks (in part) to the international interest sparked by the Data Mesh framework. Moreover, it is a cornerstone of a data-driven organization. How is it actually different from a standard dataset? What are its distinctive features and the contexts in which to apply it? How do you make (and more importantly, how do you use) a data product? Find out more in this deep-dive by Mario Vellella and Mauro Tuvo.
