Data Quality Is Now the Primary Factor Limiting GenAI Adoption
- By Brett Kahnke and Michele Goetz, Forrester
- March 11, 2024
When generative artificial intelligence (GenAI) burst into prominence with the release of ChatGPT in 2022, technically savvy business users quickly began experimenting. At that time, existing tools were limited to a fairly select set of use cases, and trustworthiness was low. With few ready-to-use GenAI apps available, a lack of know-how was a primary impediment to pursuing GenAI solutions, especially for more specialized business cases.
The core technology for text-based GenAI is the large language model (LLM). The complexity and resources required to create LLMs put them out of reach for most companies. Today, businesses looking to implement GenAI use cases have a selection of existing LLMs available for them to use and customize. The barrier to entry has dropped within reach of almost any mature technology team. For companies that lack the skills or ambition to work directly with LLMs, software platforms across virtually every business function now offer out-of-the-box GenAI features for daily use, with little to no specialized skills required.
If 2023 was the year of GenAI experimentation, 2024 is shaping up to be the year to launch GenAI into production solutions to serve customers and customer-facing roles. The ability to summarize vast quantities of unstructured data and generate creative content (including text, images, and even video!) is intriguing to revenue officers and the broader executive team. Forrester's September 2023 Artificial Intelligence Pulse Survey showed that 70% of B2B companies are already using GenAI, and another 20% are exploring its use.
The common denominator is the quality of the data
A lot can go wrong between the user's request, the interpretation of the question, the generation of the response, and the communication of that response back to the user. Regardless of which technical path your business pursues, the primary limiting factor you'll face today is your data quality. The old adage "garbage in, garbage out" is even more true for GenAI. GenAI places unprecedented strain on your data governance capabilities for several reasons:
- GenAI consumes data at a new level of speed, scale, and complexity. Data and operations teams managing traditional business use cases focus on curating and cleansing defined data sets. GenAI consumes structured and unstructured data, allowing access to insight generation at a speed and scale never before seen, including data types that most businesses do not actively manage.
- GenAI uses data to generate insights unpredictably. Measurement and analytics teams are accustomed to controlling the gateways through which end users query available data. Insights are delivered through reports and dashboards, each offering a curated experience with limited scope. GenAI grants access to a vast data repository and will make intuitive leaps to support user queries. Data management teams can no longer predict which data must be cleansed to deliver accurate insights because they no longer control which questions are being asked.
- Security, privacy, and consent require new processes that don’t exist today. Managing data security and privacy in traditional business use cases relies on controlling source data. Data that violates compliance standards is purged, and data security relies on controlling which approved users can access specific data sets. GenAI models do not rely on active queries of source data to fulfill requests. Once training data has been ingested, data teams can no longer easily control which users can access which data elements. Security and compliance depend on knowing each end user's appropriate level of access. The absence of any current standard for linking GenAI models back to their source data creates new levels of uncertainty and risk. In the AI Pulse Survey mentioned above, data privacy and security concerns are seen as the most significant barriers to the adoption of GenAI by B2B enterprises.
The data challenges for GenAI demand a different approach to data quality
Managing data quality for GenAI use cases demands a different set of skills from operations teams. It also requires retraining teams to manage new concepts. At a high level, this shift in mindset has a few key themes:
- Operations teams must be more closely aligned with technology resources. This partnership of technical skill and business insight is critical to generating trusted responses from any GenAI tool.
- Data steward roles must expand their ability to provide domain expertise, taking on new roles as the arbiters of accurate insight generation.
- Data management must move from the cleansing and control of discrete data sets into the ongoing, active curation of conversations, both prompt and response.
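To make the last theme more concrete, here is a minimal sketch of what curating conversations could look like in practice: each prompt and response is logged as a pair, and a data steward with domain expertise marks it accurate or inaccurate. Everything in it (the `Exchange` and `CurationLog` names, the `domain` field, the three verdict values) is a hypothetical illustration and not something the article or any particular GenAI platform prescribes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Verdict(Enum):
    """Review states a steward can assign (hypothetical labels)."""
    PENDING = "pending"
    ACCURATE = "accurate"
    INACCURATE = "inaccurate"


@dataclass
class Exchange:
    """One prompt/response pair captured for steward review."""
    prompt: str
    response: str
    domain: str  # business area whose steward should review this pair
    verdict: Verdict = Verdict.PENDING
    note: str = ""


@dataclass
class CurationLog:
    """In-memory log of exchanges; a real system would persist these."""
    exchanges: List[Exchange] = field(default_factory=list)

    def record(self, prompt: str, response: str, domain: str) -> Exchange:
        """Capture a new prompt/response pair in the pending state."""
        exchange = Exchange(prompt, response, domain)
        self.exchanges.append(exchange)
        return exchange

    def pending(self, domain: str) -> List[Exchange]:
        """Return the pairs in a domain that still await review."""
        return [
            e for e in self.exchanges
            if e.domain == domain and e.verdict is Verdict.PENDING
        ]

    def review(self, exchange: Exchange, verdict: Verdict, note: str = "") -> None:
        """Store the steward's verdict and an optional explanation."""
        exchange.verdict = verdict
        exchange.note = note

    def accuracy(self, domain: str) -> Optional[float]:
        """Share of reviewed pairs marked accurate, or None if none reviewed."""
        reviewed = [
            e for e in self.exchanges
            if e.domain == domain and e.verdict is not Verdict.PENDING
        ]
        if not reviewed:
            return None
        accurate = sum(1 for e in reviewed if e.verdict is Verdict.ACCURATE)
        return accurate / len(reviewed)
```

A steward for a given domain would work through `pending(domain)`, call `review(...)` on each pair, and watch `accuracy(domain)` over time as a rough signal of where the underlying data needs attention.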
The views and opinions expressed in this article are those of the author and do not necessarily reflect those of CDOTrends.
Brett Kahnke and Michele Goetz, Forrester
Brett Kahnke, Forrester’s principal analyst, is passionate about helping clients learn new strategies and develop their internal capabilities to drive success. His research builds on more than 25 years of diverse experience in marketing, operations, analytics, and technology.
Michele Goetz, Forrester’s vice president and principal analyst, serves enterprise architects, chief data officers, and business analysts trying to navigate the complexities of data while running an insight-driven business. Her research covers artificial intelligence technologies and consultancies, semantic technology, data management strategy, data governance, and data integration.