How AI Can Unlock The Power Of Unstructured Data

This article was originally published on Forbes.com

Getting strategically valuable insights out of unstructured data has long been a corporate holy grail—even more so now in the age of AI.

This is largely because the digitization of business has led to a proliferation of data sources—videos, audio clips, images, texts—that all contribute to an unfathomably large trove of data waiting to be analyzed and acted upon. Deloitte estimated that the number of digital data bits in 2020 roughly equaled the number of stars in the universe.

The problem: Unstructured data—data from different sources in different formats and subject to different cataloging conventions—is notoriously difficult to analyze and, even more so, to mine for valuable insights.

At first glance, quick-minded large language models (LLMs) like GPT, trained on incredibly large datasets, look like just the fast, automated, superhuman tool we need to coax rhyme and reason from unstructured data. Wouldn’t it be great if we could just hand over the data to an LLM and ask it to unearth trends, predictions and other key business insights?

AI may someday get to this point and help businesses leverage the (conservatively) 80% of data that is unstructured. However, businesses shouldn’t assume that just because LLMs are on the tip of every executive’s tongue, and just because you can literally ask them anything, they are a technology magic bullet.

Making the most of unstructured business data with generative AI still requires some strategic consideration.

Four Considerations When Turning Unstructured Data Into AI-Generated Business Insights

1. You need to get the data right.

As obvious as it may seem, the first challenge is to get hold of all the right data to feed into an LLM.

When dealing with structured data, the data is often in a modeled computer system: a relational database, a data warehouse or some other information system.

In contrast, unstructured data tends to be distributed across different systems, locations and, sometimes, formats and versions. In sales, for example, unless you can systematically capture the majority of customer conversations, the resulting analysis will be partial.

Other domains have similar challenges. Because analysis of unstructured data was deemed harder, companies often didn’t invest the resources to collect and categorize it, let alone make it readily available to AI systems. There is an abundance of write-ups located on people’s (virtual or physical) drives and in email communication; there’s data in spreadsheets, without any related business context. Without access to this data, any analysis will be incomplete.

2. You need to translate the unstructured data into structured data.

Even the simplest business questions start with correlating discrete, structured values.

For example, if you’d like to know whether win rates increase when salespeople offer a discount, you have to start by answering two questions: what the win rate is (a discrete value, often located in a database system), and whether a discount was offered (a discrete value, hidden inside unstructured data).

Unless you can mine the discrete values out of the unstructured data (in this instance, sales conversations), you won't be able to apply most kinds of data analysis, such as drill-downs, aggregations and predictions.
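Once both values are discrete, the correlation itself is simple arithmetic. The following is a minimal sketch in Python, assuming hypothetical deal records in which `won` comes from a database system and `discount_offered` has already been extracted from sales conversations. The field names and sample data are illustrative only.

```python
from collections import defaultdict

# Hypothetical records: `won` would come from a CRM database, while
# `discount_offered` would have to be extracted from sales conversations.
deals = [
    {"deal_id": 1, "won": True, "discount_offered": True},
    {"deal_id": 2, "won": False, "discount_offered": True},
    {"deal_id": 3, "won": True, "discount_offered": False},
    {"deal_id": 4, "won": False, "discount_offered": False},
    {"deal_id": 5, "won": True, "discount_offered": True},
]


def win_rate_by_discount(records):
    """Return {discount_offered: win_rate} for the given deal records."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for record in records:
        key = record["discount_offered"]
        totals[key] += 1
        wins[key] += int(record["won"])
    return {key: wins[key] / totals[key] for key in totals}


rates = win_rate_by_discount(deals)
print(rates)
```

The hard part is not this aggregation but producing a trustworthy `discount_offered` column in the first place.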

The situation becomes even more complex when it comes to categorical values that go beyond “yes” and “no.”

For example, to analyze company expenses, you’d have to figure out what category each expense maps to. In practice, some expenses would map to more than one category. Some categories may be standard—like meals—but others may be specific to your business.
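To illustrate why this is a multi-label problem, here is a small Python sketch in which one expense maps to more than one category. The keyword rules and category names (including the business-specific `customer_events`) are hypothetical placeholders, not a recommended classification method.

```python
# Hypothetical keyword rules; a real system would likely use a trained classifier.
CATEGORY_KEYWORDS = {
    "meals": {"lunch", "dinner", "catering"},
    "travel": {"flight", "hotel", "taxi"},
    "customer_events": {"customer", "conference"},  # business-specific category
}


def categorize(description):
    """Return the sorted list of every category whose keywords appear."""
    words = set(description.lower().split())
    return sorted(
        name for name, keywords in CATEGORY_KEYWORDS.items() if words & keywords
    )


print(categorize("Dinner with customer after the conference"))
```

Because the function returns a list rather than a single label, downstream analysis has to decide how to count an expense that belongs to several categories.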

3. LLMs won’t perform magic.

Popular LLMs are trained to generate the most probable next word in a sentence, hence the term “generative AI.” They are not optimized for so-called classification tasks, which are the mechanism for converting unstructured data to discrete values.

Of course, you can ask an LLM to classify documents by providing it with a document and asking what category or categories it maps to, but the results may be mediocre.

First, off-the-shelf LLMs lack context that’s specific to your business. Is a certain legal change acceptable to your business? Are the terms and conditions mentioned in such a legal document supported by the business?

Second, LLMs simply lack the intrinsic accuracy needed for such tasks. LLMs are trained mostly on textual data and can thus answer textual questions (“How do I make an omelet?” “Can you give my story a funny title?”) that result in synthesizing text. In many cases, asking an off-the-shelf LLM to categorize even simple data can generate results that are less than 50% accurate. This may be sufficient for some basic tasks (hence the proliferation of summarization capabilities), but it would be irresponsible to make business decisions based on such low accuracy.
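One way to check whether a classifier is accurate enough for your use case is to compare its output against a small hand-labeled sample. The Python sketch below assumes you already have both lists; the labels shown are made up for illustration and do not come from any real model.

```python
def accuracy(predicted, expected):
    """Fraction of positions where the predicted label matches the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one labeled example")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# Hypothetical labels: `expected` is a small hand-labeled sample,
# `predicted` is what a classifier (LLM or otherwise) returned for the same items.
expected = ["meals", "travel", "meals", "software"]
predicted = ["meals", "meals", "meals", "software"]

score = accuracy(predicted, expected)
print(f"accuracy: {score:.0%}")
```

A single accuracy number hides which categories are being confused, so in practice it may be worth breaking the score down per category as well.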

4. Business value comes from contextualized, domain-specific systems.

Off-the-shelf LLMs do a mediocre job of converting unstructured data into discrete values for further analysis. On the other hand, AI models can be adapted to help organizations make such analysis possible. This adaptation process is referred to as fine-tuning: taking an open-source LLM (called a foundation model) and training it further on additional, domain-specific data to optimize it for a certain business domain. Such fine-tuning can happen at an industry level (e.g., healthcare), a profession level (e.g., support agents) or even for one’s company.

Once optimized for a specific domain or task, these AI systems can reliably handle unstructured documents. The process starts by leveraging the fine-tuned LLMs to process the unstructured documents and convert them into discrete values. Then, analysis can be made available for business users using conventional or domain-specific business intelligence software. At that point, data can also be analyzed further by data analysts or data scientists.
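The two-stage flow described above (convert documents to discrete values, then hand them to analysis tools) can be sketched in Python as follows. Here `keyword_classifier` is a hypothetical stand-in for a fine-tuned model, and CSV is assumed as the hand-off format simply because most business intelligence tools can import it.

```python
import csv
import io


def keyword_classifier(text):
    """Hypothetical stand-in for a fine-tuned model: flags discount mentions."""
    return {"discount_offered": "discount" in text.lower()}


def documents_to_csv(documents, classify):
    """Run `classify` on each document and return the discrete values as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["doc_id", "discount_offered"], lineterminator="\n"
    )
    writer.writeheader()
    for doc_id, text in documents.items():
        writer.writerow({"doc_id": doc_id, **classify(text)})
    return buffer.getvalue()


documents = {
    "call-1": "We can offer a 10% discount if you sign this quarter.",
    "call-2": "Let's walk through the onboarding plan.",
}
csv_text = documents_to_csv(documents, keyword_classifier)
print(csv_text)
```

Passing the classifier in as a parameter keeps the pipeline unchanged when the placeholder is swapped for a real model.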

Final Thoughts

When companies can effectively harness unstructured data with generative AI, they will unlock new value. The hype around generative AI makes it sound like this is a plug-and-play process, but in reality, it’s still a complex task. Yet the effort is worthwhile: It will significantly enhance the ability of businesses to extract actionable insights from their data, revolutionizing how they operate and compete.
