WHY THIS MATTERS IN BRIEF
The volume of information in the world is growing exponentially and data scientists can’t keep up with managing or sorting it all, but AI can …
We all know that, at the moment, Artificial Intelligence (AI) thrives on data, at least until it can learn new things without task-specific training examples, an approach called Zero Shot Learning that is already emerging. For the moment, though, the general rule of thumb for AI is that the more data it can access, and the more accurate and contextual that data is, the better the results will be.
The problem is that the data volumes currently being generated by the global digital footprint are so vast that it would take millions, if not billions, of data scientists to crunch it all, and even then it would not happen fast enough to make a meaningful impact on AI-driven processes. So it shouldn’t come as a surprise that many organisations are now getting AI to scrub its own data and using it to automate data scientists’ jobs.
According to Dell’s 2021 Global Data Protection Index the average enterprise is now managing ten times more data compared to five years ago, with the global average skyrocketing from “just” 1.45 petabytes in 2016 to 14.6 petabytes today. And, with data being generated everywhere – in the datacenter, the cloud, the edge, and on connected devices around the world – we can expect this upward trend to continue well into the future.
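To put those figures in perspective, the short sketch below works out the yearly growth rate they imply. It assumes steady compound growth across the five years, which is a simplification and not something the Dell report states, and the variable names are purely illustrative.

```python
# Figures quoted above: average enterprise data volume in petabytes.
start_pb = 1.45   # 2016
end_pb = 14.6     # five years later
years = 5

# Overall growth multiple across the period.
growth_multiple = end_pb / start_pb

# Implied compound annual growth rate, ASSUMING growth was steady
# from year to year (an assumption made for illustration only).
annual_growth_rate = growth_multiple ** (1 / years) - 1

print(f"Growth multiple: {growth_multiple:.1f}x")
print(f"Implied annual growth: {annual_growth_rate:.1%}")
```

Under that assumption the tenfold jump works out to data volumes growing by a little under 60 percent every year.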
In this environment, any organisation that isn’t leveraging data to its full potential is throwing money out the window. So going forward, the question is not whether to integrate AI into data management solutions, but how.
AI brings unique capabilities to each step of the data management process, not just by virtue of its ability to sift through massive volumes looking for salient bits and bytes, but also by the way it can adapt to changing environments and shifting data flows. For instance, according to David Mariani, founder of, and CTO at, AtScale, in the area of data preparation alone AI is already getting great at automating key functions like matching, tagging, joining, and annotating. From there, it is becoming increasingly adept at checking data quality and improving integrity before scanning volumes to identify trends and patterns that would otherwise go unnoticed, all of which is particularly useful when the data is unstructured.
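As a rough illustration of what automated matching, tagging, and joining involve, here is a minimal Python sketch. It uses plain string similarity from the standard library as a stand-in for a trained model, so it is not how AtScale or any other vendor does it. The record layout, the `match_and_join` helper, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and stray whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_and_join(left, right, key, threshold=0.8):
    """Match each left record to its most similar right record, then join them.

    Records are tagged "matched" or "unmatched" so a person can review
    the ones the matcher was not confident about.
    """
    joined = []
    for left_row in left:
        best_row, best_score = None, 0.0
        for right_row in right:
            score = similarity(left_row[key], right_row[key])
            if score > best_score:
                best_row, best_score = right_row, score
        if best_row is not None and best_score >= threshold:
            # Join: fields from the left record win on conflict.
            merged = {**best_row, **left_row, "tag": "matched"}
        else:
            merged = {**left_row, "tag": "unmatched"}
        merged["match_score"] = round(best_score, 2)
        joined.append(merged)
    return joined


# Made-up records from two hypothetical systems.
crm = [
    {"name": "Acme Corporation", "region": "EMEA"},
    {"name": "Globex Inc", "region": "APAC"},
]
billing = [
    {"name": "ACME Corporation ", "balance": 1200},
    {"name": "Initech", "balance": 300},
]

result = match_and_join(crm, billing, key="name")
for row in result:
    print(row)
```

The first record is joined despite the differences in case and spacing, while the second finds no close counterpart and is tagged for review. A production system would swap the similarity function for a learned model, but the match, tag, and join steps would look much the same.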
One of the most data-intensive industries is health care, with medical research generating a good share of the load. Small wonder, then, that Clinical Research Organisations (CROs) are at the forefront of AI-driven data management, according to Anju Life Sciences Software. For one thing, it’s important that data sets are not overlooked or simply discarded, since doing so can throw off the results of extremely important research.
Machine learning is already proving its worth in optimising data collection and management, often preserving the validity of data sets that would normally be rejected due to collection errors or faulty documentation. This, in turn, produces greater insight into the results of trial efforts and drives greater ROI for the entire process.
Still, many organisations are just getting their new Master Data Management (MDM) suites up and running, making it unlikely they will replace them with new intelligent versions any time soon. Fortunately, they don’t have to. According to Open Logic Systems, new classes of intelligent MDM boosters are hitting the channel, giving organisations the ability to integrate AI into existing platforms to support everything from data creation and analysis to process automation, rules enforcement, and workflow integration. Many of these tasks are trivial and repetitive, so automating them frees up data managers’ time for higher-level analysis and interpretation.
This trend toward deploying AI to manage the data it needs to perform other duties in the digital enterprise will change the nature of work for data scientists and other knowledge workers. Instead of preparing and managing data by hand as they do now, people will focus on monitoring the results of AI-driven processes and making changes when those results veer from defined objectives.
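That monitoring role can be pictured as a simple check that compares what an automated process reports against the objective set for it, and flags anything that drifts too far for a person to look at. The sketch below is only an illustration: the metric names, targets, and tolerances are invented, and `needs_review` is a hypothetical helper.

```python
def needs_review(observed: float, objective: float, tolerance: float) -> bool:
    """Return True when a result strays from its objective by more than the tolerance."""
    return abs(observed - objective) > tolerance


# Hypothetical objectives for an AI-driven data pipeline: (target, tolerance).
objectives = {
    "duplicate_rate": (0.01, 0.005),
    "records_tagged_share": (0.95, 0.03),
}

# Hypothetical results reported by the latest automated run.
latest_run = {
    "duplicate_rate": 0.04,
    "records_tagged_share": 0.96,
}

# Collect the metrics a human should investigate.
flagged = [
    name
    for name, (target, tolerance) in objectives.items()
    if needs_review(latest_run[name], target, tolerance)
]

print("Needs human review:", flagged)
```

Here only the duplicate rate is flagged, because it sits well outside its tolerance while the tagging share stays within range.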
More than anything, however, AI-driven data management will speed up the pace of business dramatically. Data is king in the digital universe, and kings don’t like to wait. But this shift also raises a question: what happens when AI manages its own data and humans are no longer in the loop?