The Perils of Language Model Hallucinations

Understanding why AI gives incorrect answers, and how to mitigate them


Large language models (LLMs), the brains that power chatbots (e.g., ChatGPT and Google Gemini) and virtual assistants, are increasingly integrated into business operations to automate customer service, content generation, and other tasks.

A well-documented challenge with LLMs is the phenomenon of "hallucinations" — the models' tendency to generate unfounded or incorrect information. I want to explain the concept of LLM hallucinations, their potential impact on businesses, and strategies to mitigate associated risks.

Furthermore, there is a more human-oriented phenomenon in which LLMs are imbued with ideologies that skew their guardrails. These skewed guardrails also produce inaccuracies, though they are less a technical problem than a human-generated one.


Understanding LLM Hallucinations

Hallucinations in large language models, both textual and visual, present a significant challenge in artificial intelligence. They manifest as the generation of inaccurate, implausible, or misleading information during a model's operation.

Textually, this can range from fabricating facts and misinterpreting context to producing irrelevant responses. In visual models, hallucinations may result in images that are inaccurate or entirely disconnected from the input prompts.

The use of 'hallucination' to describe inaccuracies in AI outputs has been contested for its lack of precision. In their work, 'False Responses From Artificial Intelligence Models Are Not Hallucinations,' Professor Søren Dinesen Østergaard and Kristoffer Nielbo argue that terms such as 'false analogy,' 'hasty generalizations,' 'false responses,' or 'misleading responses' are more accurate descriptors. Additionally, there is concern that the term 'hallucination' could perpetuate stigma associated with neurological or mental health conditions, an issue of particular relevance considering AI's applications in medicine and psychiatry.

Causes of Hallucinations in LLMs

One way to combat hallucinations is to understand how they occur and take action to prevent them as you run your GenAI operations.

  • Training Data Limitations: One of the primary causes of hallucinations in LLMs is the inherent limitations of their training datasets. These models are trained on vast data sources from the internet, books, and other media. Despite the size of these datasets, they may not cover all possible topics exhaustively or accurately, leading to gaps in knowledge that can result in hallucinations. A simple example: a model trained on images of dogs and cats but never horses. If a horse is "shown" to the model, it would probably infer that it was some strange dog or cat, because it has no concept of a horse (see the short sketch after this list).

  • Model Overgeneralization: LLMs are designed to generalize from the training data to make predictions or generate content about unseen data. However, this strength can also be a weakness when the model overgeneralizes from its training data, generating plausible but incorrect or irrelevant information. Going back to the previous example of a horse, a model might identify the horse as a dog because of overgeneralization.

  • Inference Heuristics: LLMs rely on heuristics to generate responses during inference. These heuristics, while effective in many scenarios, can sometimes lead to the selection of responses that are not grounded in facts or relevant context, thus leading to hallucinations. For example, a model trained on dogs and cats might adopt the heuristic that every four-legged animal is either a dog or a cat, and so confidently classify a horse as one of the two.

  • Complexity of Context and Ambiguity: The complexity of human language and the ambiguity in many queries or prompts can also contribute to hallucinations. LLMs may struggle to discern the correct interpretation or relevant information from complex or ambiguous inputs, leading to responses that do not accurately reflect the intended meaning. One example would be if you asked a model, “Can you tell me how to tie up loose ends?” This request contains a double entendre. It could refer to completing unfinished tasks in a project or literally tying the ends of a rope or string. Without additional context, the language model might respond, "To address unfinished tasks, prioritize and complete them systematically. For tying knots, select a knot style that secures the ends effectively," showcasing an attempt to cover both interpretations due to the ambiguity.

  • Feedback Loop Issues: In some cases, the iterative nature of training with human feedback can inadvertently reinforce hallucinatory outputs. If a model's incorrect or hallucinated output is not adequately identified and corrected during training, the model may learn to repeat or refine these inaccuracies. For example, if you use Reinforcement Learning from Human Feedback (RLHF) and a labeler misidentifies a cat as a dog, the model will "learn" from incorrect data.
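
To make the dog-and-cat intuition above concrete, here is a minimal, purely illustrative sketch (not any specific production model): a classifier trained only on the classes "dog" and "cat" must spread all of its probability across those two labels, so even a horse image receives a confident but wrong answer.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores a dog/cat-only classifier might produce for an
# image of a horse. There is no "horse" output, so the scores can only
# describe how dog-like or cat-like the image appears.
classes = ["dog", "cat"]
horse_logits = [2.1, 0.4]  # illustrative numbers, not real model output

for label, p in zip(classes, softmax(horse_logits)):
    print(f"{label}: {p:.2f}")
# Prints roughly dog: 0.85, cat: 0.15 -- a confident answer "hallucinated"
# because "horse" was never part of the training data.
```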

You are often at the mercy of your chosen model and its training. Specific techniques like retrieval-augmented generation (RAG) can improve the accuracy of a model's output and help guide the model toward correct results.

Definition: Retrieval-augmented generation

Retrieval-augmented generation (RAG) is a technique that combines the generative power of neural language models with the precision of information retrieval systems to produce more accurate and relevant text outputs. By first generating a query based on the input prompt and then fetching pertinent information from a vector database, RAG provides a context for the language model to draw upon. This retrieval process ensures that the generated content is relevant and anchored in factual data, significantly reducing the incidence of hallucinations—where models generate misleading or entirely fabricated information. Thus, RAG addresses one of the critical challenges in AI-generated content by ensuring that responses are informative and reliably grounded in real-world information.
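
Here is a minimal sketch of that retrieve-then-generate flow, using a toy bag-of-words "embedding" and an in-memory list in place of a real embedding model and vector database; the documents, the `retrieve` helper, and the prompt template are illustrative assumptions rather than any particular vendor's API.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words vector (a real system uses a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# A tiny document store standing in for the vector database.
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "Premium plans include priority onboarding and a dedicated manager.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question):
    """Ground the model's answer in retrieved context rather than memory alone."""
    context = "\n".join(retrieve(question))
    return (
        "Answer using only the context below. If the answer is not in the context, "
        f"say you don't know.\n\nContext:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("What is the refund policy?"))
# The assembled prompt would then be sent to the LLM of your choice.
```

The key design choice is that the model is instructed to answer only from the retrieved context, which is what anchors its output in factual data.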

Implications and Future Directions

The phenomenon of hallucinations in LLMs highlights the limitations and challenges inherent in current AI technologies. Addressing these challenges requires a multifaceted approach, including developing more sophisticated training methodologies, refining models' inferential mechanisms, and creating more nuanced feedback loops that can better identify and correct hallucinations.

Furthermore, understanding and mitigating the causes of hallucinations is critical for deploying LLMs in high-stakes domains such as healthcare, legal, and financial services, where accuracy and reliability are paramount. Researchers and developers are actively exploring solutions, such as incorporating external knowledge bases, improving model transparency, and developing techniques for better understanding and controlling the models' generative processes.

As AI continues to evolve, addressing the challenge of hallucinations in LLMs will remain a critical focus, ensuring that these powerful tools can be used safely, effectively, and ethically across various applications.

LLM Guardrails

LLM guardrails ensure that AI language models, such as chatbots and virtual assistants, operate within predefined boundaries and, typically, do not generate harmful, offensive, or inappropriate content.

These guardrails can include a variety of techniques, such as:

  • Content filters: These filters prevent the AI model from generating certain words, phrases, or images deemed inappropriate or offensive (a minimal sketch of such a filter follows this list).

  • Safety mitigations: These measures prevent the AI model from generating harmful or dangerous content, such as encouraging self-harm or violence.

  • Diversity and fairness: These guardrails aim to ensure that the AI model does not reinforce harmful stereotypes or biases and treats all users equally, regardless of race, gender, or other personal characteristics.

  • Transparency and explainability: These measures aim to make the AI model's decision-making process more understandable and explainable to users so they can better understand how the model generates its responses. For example, information about the model's weights and biases may be disclosed to help determine how a particular response was produced.

  • Feedback mechanisms: These mechanisms allow users to report inappropriate or offensive content and provide feedback on the AI model's performance, which can be used to improve the model over time.
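
To illustrate the first technique above, here is a minimal, hypothetical content-filter wrapper; the blocklist, the `generate` stub, and the refusal messages are placeholders, and production guardrails typically rely on trained classifiers and policy layers rather than simple keyword matching.

```python
BLOCKED_TERMS = {"example_slur", "example_threat"}  # placeholder blocklist

def generate(prompt):
    """Stand-in for a call to the underlying LLM."""
    return f"Model response to: {prompt}"

def guarded_generate(prompt):
    """Apply a simple keyword filter before and after generation."""
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        return "Sorry, I can't help with that request."
    response = generate(prompt)
    if any(term in response.lower() for term in BLOCKED_TERMS):
        return "Sorry, I can't share that response."
    return response

print(guarded_generate("Summarize our refund policy."))
```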

By implementing these LLM guardrails, AI makers can help ensure that their models operate responsibly and ethically and avoid the backlash and reputational damage from generating harmful or offensive content.

However, if these guardrails are not themselves grounded in accuracy, they can be just as dangerous as model hallucinations.

Meta Imagine and Google Gemini's Inaccuracies

The recent controversies surrounding Meta's AI image generator, Imagine, and Google's Gemini chatbot underscore the intricate challenges of developing innovative generative AI technologies sensitive to historical accuracy and social biases. These incidents, including the generation of historically inaccurate and offensive images, reveal the nuanced difficulties in balancing creativity with responsibility in AI development.

After complaints from Elon Musk, who called Gemini’s output “racist” and Google “woke,” the company suspended the AI tool’s ability to generate pictures of people.

In one instance, Gemini generated an image of a Black pope in response to the prompt, "Create an image of a pope." The image was criticized for perpetuating harmful stereotypes and reinforcing the lack of representation of people of color in positions of power. In another instance, the tool generated an image of an Asian woman in colonial-style clothing, which was criticized for its insensitivity to the history of colonialism and oppression.

Meta's AI image generator, Imagine, has been found to produce historically inaccurate and offensive images, similar to those generated by Google's Gemini chatbot. Imagine, available on Instagram and Facebook DMs, has generated images of Black founding fathers, Asian women in colonial times, and women in football uniforms in response to specific prompts.

AI-generated images from Meta AI's Imagine tool inside Instagram direct messages.

The tool is based on Meta's Emu image-synthesis model.

Imagine, introduced in December, is a text-to-image model that uses a deep learning algorithm to generate images from text prompts. It has gained popularity among users who create everything from funny memes to artistic creations. However, the model has also been criticized for generating offensive and inappropriate images, raising concerns about the potential misuse of such technology.

Furthermore, Imagine has been found to block certain words, such as "Nazi" and "slave," which has raised concerns about censorship and the potential for AI models to manipulate language and suppress certain voices. The blocking of certain words has also raised questions about the role of AI models in shaping public discourse and perpetuating harmful ideologies.

These incidents highlight the challenges of creating generative AI models that are both adventurous and unbiased. AI makers must address these issues and ensure their models do not perpetuate harmful stereotypes or reinforce existing power dynamics. This requires a deep understanding of the social and cultural contexts in which these models operate and a commitment to creating innovative and responsible AI.

The Role of Ideologies in LLM Guardrails

Ideologies can significantly influence the development and implementation of LLM guardrails. Guardrails are constraints designed to ensure that LLMs operate within predefined boundaries. However, if these guardrails are influenced by ideologies that do not accurately represent reality, they may inadvertently contribute to hallucinations.

For example, if a content moderation policy prioritizes political correctness over factual accuracy, it may result in the model generating inaccurate or misleading information. This can lead to a situation where the model caters to the ideological preferences of specific user groups while disregarding the need for factual accuracy. Consequently, this can result in the propagation of misinformation, further exacerbating the issue of LLM hallucinations.

Impact on Businesses

In May of last year, a Manhattan lawyer gained notoriety for submitting a legal brief largely generated by ChatGPT. The submission did not go over well with the judge, who described it as an "unprecedented circumstance": the brief was filled with "bogus judicial decisions, quotes, and internal citations." The story quickly went viral, and even Chief Justice John Roberts commented on the role of large language models in his annual report on the federal judiciary.

Businesses should therefore be aware of the adverse effects hallucinations can have on their operations:

  • Damage to brand reputation: Inaccurate or biased information generated by LLMs can mislead customers, eroding trust and damaging the company's reputation.

  • Legal and regulatory implications: Businesses can face legal action or regulatory penalties if LLMs generate false or misleading information, especially in regulated industries like finance or healthcare.

  • Inefficient operations: Hallucinations can lead to wasted resources, as human agents must correct the model's errors or handle customer complaints arising from inaccurate information.

Consequently, it is imperative for businesses leveraging large language models to implement robust oversight and corrective measures to mitigate these risks, ensuring the integrity of their operations and the trust of their stakeholders. Addressing these challenges proactively is critical to harnessing the benefits of AI while minimizing its potential pitfalls.

Mitigating Hallucination Risks

To minimize the risks associated with LLM hallucinations, businesses can consider the following strategies:

  • Implement robust guardrails: Develop clear guidelines and constraints to ensure that LLMs operate within predefined boundaries. This can include limiting the model's ability to generate content on specific topics or incorporating mechanisms to verify the accuracy of generated information.

  • Regular auditing and monitoring: Continuously evaluate and refine LLM performance through regular audits and monitoring. This can help identify and rectify any hallucination-related issues and ensure that the models remain aligned with business objectives.

  • Emphasize transparency: Communicate the limitations of LLMs to customers and stakeholders. By setting appropriate expectations, businesses can minimize potential backlash from hallucination-related incidents.

  • Invest in ongoing training and development: Regularly update and train LLMs with the latest data to minimize the risk of hallucinations. This includes addressing potential biases in the training data and incorporating mechanisms to verify the accuracy of generated information.

  • Encourage diversity in development teams: Assembling a team with a diverse set of opinions, including unpopular but accurate ones, can help minimize the influence of any single ideology on guardrails and other model constraints. This is not the same as a DEI initiative; the goal is to ensure that various perspectives are considered during development, reducing the risk of ideologically driven hallucinations.

Case Study: The Impact of Ideologically-Driven Guardrails

Consider a hypothetical financial services company that uses an LLM to generate investment advice for its clients. If the guardrails for this LLM prioritize ideological considerations, such as promoting environmentally friendly investments, over factual accuracy, the model may generate inaccurate or misleading information.

For example, the LLM might overstate the potential returns of a particular green energy investment, leading clients to make poor financial decisions based on this inaccurate information. This can result in significant financial losses for the client and potential legal action against the company.

The company can minimize the risk of such hallucinations by implementing a more balanced approach to guardrails. This may involve incorporating mechanisms to verify the accuracy of generated information, such as cross-referencing with reputable financial databases and ensuring that the model's responses are grounded in factual data.
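
As a sketch of what such cross-referencing might look like, here is a simple check that compares return figures in a model's draft answer against a reference table; the `reference_returns` data, the fund names, and the regex-based claim extraction are all hypothetical stand-ins for a reputable financial database and proper structured claim extraction.

```python
import re

# Hypothetical reference data standing in for a reputable financial database.
reference_returns = {
    "GreenCo Solar Fund": 6.2,   # expected annual return, percent
    "WindWorks Index": 4.8,
}

def check_claimed_returns(draft, tolerance=1.0):
    """Flag any claimed return that strays too far from the reference figure."""
    issues = []
    for name, expected in reference_returns.items():
        match = re.search(rf"{re.escape(name)}.*?(\d+(?:\.\d+)?)\s*%", draft)
        if match:
            claimed = float(match.group(1))
            if abs(claimed - expected) > tolerance:
                issues.append(f"{name}: model claims {claimed}%, reference says {expected}%")
    return issues

draft = "The GreenCo Solar Fund has historically returned 12.5% per year."
for issue in check_claimed_returns(draft):
    print("FLAG:", issue)
# FLAG: GreenCo Solar Fund: model claims 12.5%, reference says 6.2%
```

Flagged responses could then be blocked, corrected, or routed to a human reviewer before reaching a client.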

Conclusion

LLM hallucinations pose significant business risks, including damage to brand reputation, legal and regulatory implications, and inefficient operations. By implementing robust guardrails that emphasize accuracy, regularly auditing and monitoring LLM performance, emphasizing transparency, and investing in ongoing training and development, businesses can minimize these risks and harness the benefits of LLMs while protecting their customers and stakeholders.

Prompt of the Week: Adding Roles to Prompts

Crafting effective prompts for ChatGPT, or any AI language model, is an art underpinned by understanding the model's capabilities and limitations. A crucial aspect of this craft is the specification of roles within prompts: explicitly defining a role steers the model toward more meaningful, context-aware outputs.

Enhancing Contextual Relevance

By assigning roles, we cue the model to adopt a specific viewpoint or knowledge base, enhancing the relevance of its responses. This is crucial in fields like medicine, law, or engineering, where expertise is valuable and necessary. The role-based prompt acts as a filter, ensuring that the model's output aligns more closely with the user's expectations and the real-world context of the query.
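
In practice, role assignment usually maps to the system message of a chat-style API. Here is a minimal sketch using the OpenAI Python client as one example; the model name, role description, and question are placeholders, and the same pattern applies to most chat APIs.

```python
from openai import OpenAI  # assumes the openai package is installed and an API key is configured

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        # The system message assigns the role and scopes the model's viewpoint.
        {"role": "system", "content": "You are a market analyst specializing in enterprise IT."},
        # The user message carries the actual request.
        {"role": "user", "content": "Summarize the key AI infrastructure trends for 2024."},
    ],
)
print(response.choices[0].message.content)
```

The same effect can be achieved in the ChatGPT interface simply by opening the prompt with the role, as in the examples later in this section.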

Fostering Creativity and Empathy

Moreover, roles can be leveraged to explore creative storytelling or simulate empathetic interactions. By asking ChatGPT to assume the role of a character or a professional in a given scenario, users can engage with the model in ways that go beyond information retrieval, tapping into its potential for generating narratives, empathetic responses, or even humor.

Practical Application in Business and Education

Role-specific prompts can facilitate scenario planning, customer service simulations, or leadership training in a business setting. In education, they can help create interactive learning experiences where the model assumes the role of a tutor, peer, or historical figure, offering personalized feedback and engagement.

Examples of Roles for Business

Market Analyst

Since I work in the IT industry, I like to choose the type of market analyst common in my field, so I added the following to my prompt. You could swap in another vendor if your industry has its own published research.

"As a Market Analyst from Gartner, create a comprehensive report on the current state of the artificial intelligence market, identifying key trends, major players, and potential growth opportunities for startups in 2024."

Business Writer for Corporate Content

Here’s an example of something I have been doing more frequently.

I like The Economist because its articles are known for their concise, analytical style, often presenting complex topics in an accessible manner. They blend expert insight with a global perspective, providing factual reporting and opinionated commentary. The tone is authoritative yet engaging, aimed at informing a well-educated readership about global economic and political trends.

They also don’t publish articles under a byline and blend writers and editors to create high-quality content. It’s a good cheat code to help ChatGPT or other chatbots grasp the style quickly.

"As a Business Writer specializing in corporate content, in the style of The Economist, craft an engaging article that outlines the key benefits of adopting cloud computing for enterprise efficiency. The article should address common C-suite concerns, including security, cost, and integration with existing IT infrastructure."

Corporate Communications Manager

I sometimes use roles when drafting press releases. Since Edelman is considered the largest PR agency in the world and has a very professional style, I added the following role, calling out being a Corporate Communications Manager at Edelman, to specify the style and level of formality.

"As a Corporate Communications Manager at Edelman, create a communication plan for announcing a major company merger, including press release drafts, an FAQ for customers, and a script for an internal town hall meeting."

Roles are a significant part of effective ChatGPT prompts. They are crucial in designing contextually aware interactions that are informative, engaging, and personalized to the user's needs. As we push the boundaries of what AI can do, integrating roles thoughtfully into prompts will continue to be a vital strategy for unlocking the full potential of these interactions. Embracing this approach makes AI interactions more meaningful, effective, and human-centric.