A Shift From Model-Centric to Data-Centric AI

AI has taken off in recent years and is bringing far-reaching changes to industry, with its influence visible in many aspects of business. Methodologies and algorithms of varying sophistication have been developed to address a wide range of problems, and most of them concentrate on the technical side of problem-solving, so the emphasis has been on the code. However, any AI solution consists of two parts: the algorithm and the data. The Data-centric AI campaign launched by Andrew Ng argues that models have already reached a high level of sophistication and that it is high time we put more focus on the quality of data.

What Is Data-Centric AI, and How Does It Help Data-Driven Businesses?

The core idea of Data-centric AI is that no amount of fine-tuning can fix bad data. Many of the models in use today are highly complex and can solve difficult problems. But if the data is incorrect or ambiguous, the model will learn exactly what it is shown. Andrew Ng therefore proposes a methodology in which the model is kept fixed and the data is improved iteratively. In other words, a model is taught most effectively with high-quality data. For this to work, a deep understanding of the data is crucial, just as a solid understanding of the problem itself is what ultimately solves a business problem. That clarity is what makes it possible to engineer data systematically.
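The idea of keeping the model fixed while improving the data round by round can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from Andrew Ng's campaign: the dataset is a made-up one-dimensional toy, the "model" is a hypothetical nearest-centroid classifier (`fit_centroids`, `predict`), and the dictionary `expert_labels` stands in for a subject matter expert reviewing flagged records. The model code never changes between rounds; only the training labels do.

```python
from statistics import mean

def fit_centroids(samples):
    """Fixed 'model': one centroid per label, from (value, label) pairs."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Return the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the given label."""
    hits = sum(predict(centroids, v) == y for v, y in samples)
    return hits / len(samples)

def flag_suspects(centroids, samples):
    """Indices of records whose label disagrees with the model's prediction."""
    return [i for i, (v, y) in enumerate(samples) if predict(centroids, v) != y]

def run_rounds(train, holdout, expert_labels, rounds=3):
    """Retrain the same model each round; only the data is modified."""
    data = list(train)
    history = []
    for _ in range(rounds):
        model = fit_centroids(data)
        history.append(accuracy(model, holdout))
        suspects = flag_suspects(model, data)
        if not suspects:
            break
        for i in suspects:
            value, label = data[i]
            # Assumption: an expert reviews each flagged record; here the
            # review is simulated by a lookup of corrected labels.
            data[i] = (value, expert_labels.get(value, label))
    return history, data

# Toy data (invented for illustration): 8.0 and 9.0 are mislabeled as "low".
train = [(1.0, "low"), (2.0, "low"), (3.0, "low"), (8.0, "low"),
         (9.0, "low"), (7.0, "high"), (10.0, "high")]
holdout = [(1.5, "low"), (4.0, "low"), (6.0, "high"), (9.5, "high")]
expert_labels = {8.0: "high", 9.0: "high"}  # hypothetical expert corrections

history, cleaned = run_rounds(train, holdout, expert_labels)
print(history)  # [0.75, 1.0]
```

In this toy run, holdout accuracy moves from 0.75 to 1.0 after a single round of label corrections, without any change to the model or its settings.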

Characterizing the Aspects of High-Quality Data

Consistency:

Metadata:

High-quality data is essential for developing a clearer understanding of the problem. It makes the decision-making process data-driven rather than technique-driven. This approach requires closer collaboration with subject matter experts. As a result, solutions can be developed in a way that lets data scientists understand and manage how the model learns, which will most likely lead to better solutions and improved performance.

Data-centric AI aims to make the best use of data, which requires clear standards from the very beginning, that is, from data collection. This can motivate businesses to standardize data collection and related processes across their value chains. Doing so streamlines data management, which in turn makes it much easier to access, monitor, and analyze data when building solutions.
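One way to picture a standard set at collection time is a small schema check that every incoming record must pass before it is stored. The sketch below is only an assumption about what such a check might look like; the field names in `SCHEMA` (`store_id`, `units_sold`, `price_usd`) and the `validate` helper are hypothetical and not part of any particular tool.

```python
# Hypothetical collection standard: required fields and their expected types.
SCHEMA = {
    "store_id": str,
    "units_sold": int,
    "price_usd": float,
}

def validate(record):
    """Return a list of problems found in one record (empty if it conforms)."""
    problems = []
    for field, expected in SCHEMA.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Fields outside the agreed standard are reported rather than silently kept.
    for field in sorted(set(record) - set(SCHEMA)):
        problems.append(f"unexpected field: {field}")
    return problems

good = {"store_id": "S-17", "units_sold": 3, "price_usd": 4.99}
bad = {"store_id": 17, "units_sold": 3, "region": "EU"}

print(validate(good))  # []
print(validate(bad))
```

Rejecting or flagging non-conforming records at the point of collection keeps inconsistencies from reaching the training data, where they are much harder to trace.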

Data-centric AI brings a range of benefits. Because the paradigm requires a deeper understanding of the data, it integrates naturally with data preprocessing, which usually takes up a large share of the time spent building a solution. The resources allocated to training could also be far lower, since the approach does not require extensive hyperparameter tuning. These are only a few of the benefits of Data-centric AI.

Affine

Affine is a provider of analytics solutions, working with global organizations solving their strategic and day to day business problems https://affine.ai