How AI can ease those data management woes
Data is the new oil, but raw data has little value on its own. Like oil, data assets have to be gathered completely and accurately, then sent through several refining processes to create value for end users. This is the general data lifecycle, an area where artificial intelligence (AI) is going to play a major role for enterprises.
Initially, managing the data lifecycle was a task small enough to be handled manually by a team of experts. Data volumes were modest, sources were few and the possible applications were limited. But with the transition to the cloud and the introduction of new sources, both the volume and diversity of data have surged.
“Data management is no longer wholly focused on relational data,” Adam Ronthal, research VP in Gartner’s ITL data and analytics group, told VentureBeat. “Document, graph, time-series, wide-column, key-value, ledger and other targeted data stores all provide specific optimizations for different types of data, and different use cases. Sometimes, these are combined in a single data management platform — a multimodel database; sometimes, they remain as best-fit, targeted point solutions.”
This increase in the volume and diversity of information has rendered traditional approaches to data management ineffective. Today, a company that selects, manages and optimizes (cleans and enhances) each dataset component individually will end up wasting a lot of time and capital; cleaning and transformation alone can take days or weeks.
The situation is comparable to Yahoo's use of human experts to manually evaluate and catalog a deluge of web pages. The company dedicated plenty of resources but could evaluate only a small portion of the internet and struggled to keep the evaluations up to date.
Bringing AI into data management
Just as Google, with its automated algorithms, overtook Yahoo's dominance of the internet by evaluating web pages more quickly and at vastly lower cost, AI is now set to revolutionize the data lifecycle.
According to Ronthal, applications of AI in data management rely on metadata analysis and activation. This allows an AI model to detect where actual data usage deviates from the system's design and, ideally automatically, correct it. This is augmented data management: using AI/ML to automate and optimize data management, allowing organizations to spend less time managing and optimizing infrastructure and more time building core business value.
Many organizations have already started applying AI- and ML-driven techniques across various components of data management, bringing improvements in speed and cost-efficiency.
For instance, in January 2023, Google and Aible, a company bringing an AI-first approach to the data journey, worked with a Fortune 500 enterprise and enabled it to analyze over 75 datasets with over 100 million rows of data across 150 million variable combinations. The total compute cost: $80, less than a thousandth of the cost of traditional methods.
Aible also published 25 case studies with Intel highlighting how enterprises across geographies and verticals benefitted from AI in less than 30 days and drove value across functions.
Overall, Ronthal notes, AI augmentation can have an impact on multiple disciplines of data management, including:
- Metadata management: Here, AI and ML can be used to explore and define the data’s metadata, evaluating metadata faster and more accurately, with reduced redundancy. Similarly, augmented data management functions can automatically catalog data elements during data extraction, access and processing.
- Data integration: AI can be used to automate the integration development process, by recommending or deploying repetitive integration flows, such as source-to-target mappings.
- Data quality: AI and ML can be used to extend the profiling, cleansing, linking, identification and semantic reconciliation of master data across different data sources.
- DBMS: In addition to enhancing performance and cost-based query optimization, AI and ML can automate many current manual management operations, including managing configurations, elastic scaling, storage, indexes and partitions, and database tuning.
- FinOps: AI and ML can be applied to budget and cost optimization problems and make recommendations about resource usage, pricing models, and second- and third-order effects of making changes in highly interconnected environments.
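To make the data integration item above more concrete, here is a minimal sketch of how a tool might recommend source-to-target mappings. It is an illustration only: it uses plain string similarity from Python's standard library as a stand-in for a trained model, and the function names, column names and the 0.6 threshold are all assumptions rather than anything described by Ronthal.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a column name and drop separators, so 'Cust_ID' becomes 'custid'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def suggest_mappings(source_cols, target_cols, threshold=0.6):
    """Return {source: (best_target, score)} for matches at or above threshold.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    suggestions = {}
    for src in source_cols:
        best_target, best_score = None, 0.0
        for tgt in target_cols:
            score = SequenceMatcher(None, normalize(src), normalize(tgt)).ratio()
            if score > best_score:
                best_target, best_score = tgt, score
        if best_target is not None and best_score >= threshold:
            suggestions[src] = (best_target, round(best_score, 2))
    return suggestions


# Hypothetical column names, invented for this example.
source = ["Cust_ID", "email_addr", "signup_dt"]
target = ["customer_id", "email_address", "signup_date", "region"]
mappings = suggest_mappings(source, target)

for src, (tgt, score) in mappings.items():
    print(f"{src} -> {tgt} (similarity {score})")
```

In practice, a recommendation like this would typically be reviewed by a person before an integration flow is deployed; a production system would likely also draw on data types, sample values and usage metadata rather than names alone.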
Priya Krishnan, head of product management for data and AI at IBM, highlighted similar applications.
“AI is being used to ingest, identify and classify datasets from a variety of sources,” she said. “It continuously mines content to surface unseen patterns and trends, providing organizations with greater visibility and actionable insights to aid in decision-making. Businesses are using AI to automate otherwise manual tasks like data capture, de-duplication, anomaly detection and data validation. They are also training models to apply regulatory policies and ethical standards automatically, ensuring those principles are embedded from the beginning.”
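Two of the tasks Krishnan mentions, de-duplication and anomaly detection, can be sketched in a few lines. The example below is a deliberately simple statistical stand-in, not an ML model and not IBM's approach: it de-duplicates on a normalized key and flags outliers with a modified z-score based on the median absolute deviation. The record fields, sample values and the 3.5 cutoff are assumptions made for illustration.

```python
from statistics import median


def deduplicate(records, key_fields):
    """Keep the first record for each normalized key; drop later duplicates."""
    seen = set()
    unique = []
    for rec in records:
        # Normalize by trimming whitespace and lowercasing each key field.
        key = tuple(str(rec[f]).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def flag_anomalies(values, cutoff=3.5):
    """Return indices whose modified z-score (MAD-based) exceeds the cutoff."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations are zero at the median; nothing can be scored.
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > cutoff]


# Hypothetical records, invented for this example.
records = [
    {"email": "ana@example.com", "amount": 120.0},
    {"email": " ANA@example.com ", "amount": 120.0},  # duplicate after normalization
    {"email": "bo@example.com", "amount": 115.0},
    {"email": "cy@example.com", "amount": 130.0},
    {"email": "di@example.com", "amount": 125.0},
    {"email": "ed@example.com", "amount": 9800.0},    # suspiciously large
]

clean = deduplicate(records, ["email"])
outliers = flag_anomalies([r["amount"] for r in clean])

print(len(clean), "unique records")
print("anomalous rows:", [clean[i]["email"] for i in outliers])
```

A median-based score is used here because a single extreme value distorts the mean and standard deviation, which can hide the very outlier being searched for.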
Source: https://venturebeat.com/data-infrastructure/how-ai-can-ease-data-management-woes/