Generative AI’s LLMs Hallucinate by Design, so How can we Trust Them?

August 11, 2023

Moving from storyteller to journalist with Large Language Models (LLMs)

What is an LLM?

Generative AI is the area of artificial intelligence devoted to generating content such as text and images. An LLM, or Large Language Model, is a Generative AI model trained on vast amounts of text specifically to produce text completions. That is, LLMs are trained to generate human-like content based on the input they receive, allowing them to (seemingly!) answer questions, write essays, summarize text, translate languages, and even generate creative content such as poetry or stories. Large language models are incredibly valuable tools due to their ability to complete complex language tasks, streamline communication, and promote efficiency. They can be used to automate customer service, draft documents, provide tutoring, and support many other applications.

Every business’ velocity is governed by the speed at which humans can read and write. Generative AI is taking away that governor.

What are LLM Hallucinations all about?

LLM hallucinations, as they are often called, are instances where the language model generates text that is not factually accurate or diverges from the information given as input. These hallucinations take different forms, ranging from subtle inaccuracies to outright fabrication of details. For instance, large language models can ‘imagine’ or ‘hallucinate’ a fictional character in a factual historical account. This propensity of LLMs to hallucinate information has been highlighted as a major concern among users. In fact, a recent report by McKinsey & Co. revealed that 56% of the survey respondents identified inaccuracies in LLMs as their number one concern. Many of these same respondents seek to mitigate hallucinations to build greater trust in large language models.

LLMs Hallucinate because they are Storytellers

LLMs hallucinate because that is their very design. These models are fed vast amounts of content and trained to understand and mimic the linguistic patterns within this data. Their ability to generate engaging content comes from understanding the relationships between words and concepts in the training data. They learn to predict what text makes sense in keeping with the genre of their input, not what is ‘accurate’, by following patterns in the data they’ve been trained on. This process inherently involves extrapolation, and it’s in these moments that LLM hallucinations happen. LLMs aren’t fact-checkers or knowledge management systems; they’re more akin to skilled storytellers, weaving together coherent and engaging narratives based on the patterns they’ve learned. Hence, hallucinations aren’t a malfunction but rather a testament to an LLM’s ability to generate creative, human-like text.
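To make the "storyteller" idea concrete, here is a toy sketch. It is nowhere near a real LLM: it is a tiny bigram model that only remembers which word followed which in a made-up training text. The function names and the corpus are invented for illustration. It shows that a model which only follows word patterns can produce fluent sentences that nobody ever wrote and that may not be true.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Map each word to the list of words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length, seed=0):
    """Extend `start` by repeatedly sampling a word that followed the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # no known continuation, so stop early
            break
        out.append(rng.choice(options))
    return " ".join(out)

# Made-up training text (illustrative only).
corpus = (
    "the rookie stole second base . "
    "the rookie hit a home run . "
    "the veteran stole third base . "
    "the veteran hit a double ."
)
model = train_bigrams(corpus)
sentence = generate(model, "the", 6)
print(sentence)
```

Every pair of adjacent words in the output was seen in training, so the sentence reads naturally, yet the model can just as easily say the rookie stole third base, which the training text never stated. The model is tracking plausibility, not truth.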

It’s called “Prompting”, not “Querying”

We should not expect to make the equivalent of a database query from an LLM. We are ‘prompting’ it to guide where we want it to go. For instance, we prompted an LLM with the following:

“Write a story about a baseball player named “Jerry.” Jerry is a rookie and is leading the league in stolen bases”

Prompt to LLM

A Tale of “Jerry the Rookie”

Once upon a time in the heart of New York City, there was a rookie baseball player named Jerry. Jerry was unlike any rookie the league had seen. With a swing as swift as a falcon’s dive and agility akin to a gazelle, he was setting the diamond ablaze. Despite being new to the high-profile league, Jerry was fearlessly leading in stolen bases.

The crowd at Yankee Stadium would fall into a hushed anticipation every time Jerry stepped up to the plate. His eyes would narrow, focusing on the pitcher as though they were the only two people in the world. The moment the pitcher’s arm swung, it was as if time itself slowed down. Jerry’s bat would meet the ball in a sweet, resonating crack that echoed throughout the stadium. Then, like a lightning bolt, he would be off, darting for first base with a speed that left spectators and opponents alike in awe.

In his rookie season alone, Jerry stole more bases than any other player in the league, quickly earning him the nickname “Jerry the Jet.” Yet, despite his skyrocketing success, Jerry remained humble. He was the first to arrive at training and the last to leave, always striving to improve. He attributed his success not to natural talent, but to determination, hard work, and the love for the game.

Jerry’s story shows that text hallucinated by an LLM can be engaging, rich in detail, and surprisingly human-like in its narrative, even though it is completely made up. It is a hallucination. This is the value of an LLM: its ability to generate text with detailed and convincing narratives, not its knowledge of facts.

Once the story was written we asked the same LLM to write a prompt for DALL-E to produce images. The images that DALL-E created are below. As you can see, there are errors and certain hallucinations in the art.

LLM generated prompt for DALL-E

From Storyteller to Journalist

Consider a journalist. She gathers data from a variety of sources on the topic of her story. She then uses that corpus of information to answer the specific questions her readers have. She is generating textual answers from that content. LLMs have that same capability, but not the journalist’s ability to assemble the correct data.

So, when you ‘prompt’ an LLM with a question, you need to provide, in that very same prompt, the relevant facts that the LLM will navigate to generate the desired answer.
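As a minimal sketch of what "facts in the same prompt" can look like, the helper below assembles a question and a list of facts into one prompt string. The function name, the instruction wording, and the sample facts (including the stolen-base count) are all made up for illustration; the resulting string would be sent to whichever LLM you use.

```python
def build_grounded_prompt(question, facts):
    """Combine the question with the facts the model is allowed to use."""
    if not facts:
        raise ValueError("a grounded prompt needs at least one fact")
    numbered = "\n".join(f"{i}. {fact}" for i, fact in enumerate(facts, start=1))
    return (
        "Answer the question using only the facts listed below. "
        "If the facts do not contain the answer, say so.\n\n"
        f"Facts:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical facts, invented for this example.
prompt = build_grounded_prompt(
    "Who leads the league in stolen bases?",
    ["Jerry is a rookie.", "Jerry has 41 stolen bases, the most in the league."],
)
print(prompt)
```

The point of the design is that the model is asked to navigate supplied facts rather than recall them, and is given an explicit way out when the facts do not cover the question.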

How can that work?

Let’s take a specific example based on our product Krista. Krista integrates people, systems, and AI in a nothing-like-code, conversational way. When we leverage Krista’s knowledge management and real-time data access capabilities, then feed that contextually relevant information to an LLM in support of a question being asked, we get the journalist, not the storyteller. In fact, Krista uses fact-checking and human feedback steps to ensure the model strictly adheres to the information presented.
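One very simple kind of fact-check can be sketched as follows. This is not Krista's implementation, which is not described here; it is a hypothetical illustration that flags any number in a generated answer that does not appear in the context given to the model. The function name and sample strings are made up.

```python
import re

# Matches integers and simple decimals such as 120000 or 0.8.
NUMBER = re.compile(r"\d+(?:\.\d+)?")

def unsupported_numbers(answer, context):
    """Return numbers that appear in the answer but not in the supplied context."""
    known = set(NUMBER.findall(context))
    return [n for n in NUMBER.findall(answer) if n not in known]

# Hypothetical context and answers, invented for this example.
context = "Acme renewal: value 120000, close probability 0.8."
good = "The Acme renewal is worth 120000 with a 0.8 chance of closing."
bad = "The Acme renewal is worth 150000 with a 0.8 chance of closing."

print(unsupported_numbers(good, context))
print(unsupported_numbers(bad, context))
```

A check like this only catches invented figures written as digits. Anything it flags, and anything it cannot judge, is where a human feedback step earns its keep.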

For example, if the question asked is “What are our best opportunities for this quarter?”, the LLM would not know the answer. It does, however, have implicit training that a higher likelihood to close is good and that higher contract values are good, so its capabilities just need to be fed with the right information from external sources to generate a reasonable answer. An automation/integration platform like Krista will pull data and additional context from Salesforce or other CRM systems, then use the LLM to construct a contextual answer for that specific user or role. Realize that even the scope of which opportunities should be considered depends on who is asking. The regional sales manager means his own transactions and is not authorized to see other sales managers’ deals, while the CSO/CRO means those of the whole company. Krista understands the context of the question, queries the systems of record, fetches the data based on the context established, and then prompts our ‘journalist’ for an answer. This approach is the most flexible and effective, as it allows for the orchestration of back-office processes across numerous systems and models.
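The flow described above (scope the data by who is asking, then prompt with it) can be sketched roughly as below. Everything here is hypothetical: the `Opportunity` fields, the role names, the in-memory `pipeline` standing in for a CRM, and the sample deals. It does not use Krista's or Salesforce's actual APIs.

```python
from dataclasses import dataclass

@dataclass
class Opportunity:
    name: str
    owner: str
    value: int                # contract value
    close_probability: float  # likelihood to close, between 0 and 1

def visible_to(opportunities, user, role):
    """Company-wide roles see every deal; anyone else sees only their own."""
    if role in {"CSO", "CRO"}:
        return list(opportunities)
    return [o for o in opportunities if o.owner == user]

def build_prompt(question, opportunities):
    """Put the scoped records into the prompt alongside the question."""
    facts = "\n".join(
        f"- {o.name}: contract value {o.value}, "
        f"likelihood to close {o.close_probability:.0%}"
        for o in opportunities
    )
    return f"Use only these opportunities:\n{facts}\n\nQuestion: {question}"

# Made-up records standing in for data fetched from a CRM.
pipeline = [
    Opportunity("Acme renewal", "dana", 120000, 0.8),
    Opportunity("Globex expansion", "dana", 300000, 0.3),
    Opportunity("Initech pilot", "lee", 50000, 0.9),
]
question = "What are our best opportunities for this quarter?"
manager_prompt = build_prompt(
    question, visible_to(pipeline, "dana", "regional_sales_manager")
)
executive_prompt = build_prompt(question, visible_to(pipeline, "pat", "CRO"))
print(manager_prompt)
```

The same question produces two different prompts. The authorization decision is made before the LLM is involved, so the model never sees records the person asking is not entitled to.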

Where are we doing this?

Don’t fall into the trap of creating large software development projects that can neither keep up with the velocity of change in the LLM world nor support the integration appetite of your user community. Krista is being used to deliver solutions like those below in days or weeks, never months.

To highlight the potential of integrating LLMs, consider the following use cases:

Contact Center Agents

LLMs can potentially transform contact centers, adding value to all parties involved – customers, agents, and the organization. For customers, intelligent self-support capabilities powered by integrated LLMs allow for immediate issue resolution, which significantly enhances customer satisfaction. Agents can use integrated LLMs as a powerful tool for information retrieval, helping them navigate various systems and data stores quickly and efficiently. From an organizational perspective, integrated LLMs can increase operational efficiency, reduce costs, and ultimately drive revenue growth, by automating responses to common queries and freeing up agents to handle more complex issues.

Human Resources

Generative AI and integrated LLMs are revolutionizing Human Resource (HR) operations, driving greater efficiency and transforming employee experiences. These technologies empower HR assistants, enabling them to provide comprehensive, instant support to employees and reducing the load on HR teams. They streamline complex processes like onboarding, changing benefits, or scheduling time off through natural, conversational interactions.

Risk Management

Krista can become the nervous system of your business. Imagine every audit-worthy item in your organization being read and understood at machine speed and integrated with your GRC content and processes. Policy outreach, third-party assessments, internal audits, and organization-wide awareness and compliance all become possible.

Healthcare Agents

The healthcare sector can greatly benefit from the integration of LLMs and generative AI, enhancing the experiences of healthcare providers, back-office staff, and patients alike. Integrated LLMs can streamline complex tasks such as revenue operations and insurance reimbursements, reducing both administrative burdens and the potential for human error. Virtual assistants equipped with LLMs can support doctors and nurses by fetching relevant and contextual patient and drug information, aiding healthcare professionals in providing timely and personalized care. LLMs can also power virtual agents to provide self-service wellness programs, delivering personalized, accessible, and convenient healthcare advice to patients or employees.

Virtual Engineers / ITSM Support

LLMs and generative AI can help speed IT service management and IT operations by driving efficiencies and creating high-value opportunities. They can operate on top of automation or orchestration platforms, such as Krista, to deliver meaningful outcomes and automate the automation as part of an automation fabric.
