Adapting content for AI: Improving accuracy of RAG solutions
Retrieval augmented generation (RAG) is one of the most common generative AI solutions that corporations are implementing. RAG is a framework that forces the large language model (LLM) to answer questions based on the documents you feed it, so that the answers are more accurate. However, most of the information I’ve found on implementing RAG assumes that if you set up the framework properly, the LLM will automatically generate accurate and complete answers. But what about considering how the actual content of the documents might affect the solution?
I’m the strategist for IBM watsonx.ai documentation, where we’ve had a RAG solution (running on watsonx.ai, of course) to answer questions based on our docs since July 2023. We’ve been evaluating the answers and analyzing trends. I’ve also collaborated across other teams at IBM to discover potential problems in content that is consumed by RAG solutions and to develop guidelines to mitigate those problems.
We’ve discovered that the structure, format, terminology, and quality of content all have a significant effect on the quality of the answers provided by LLMs in a RAG solution.
Most instructions for implementing RAG list these requirements:
- One or more documents.
- A way to search through your documents and retrieve the relevant content. A common practice is to split your content into chunks and index those chunks in a vector database.
- A way to preprocess your documents into the text format that LLMs understand.
- An LLM that’s optimized for RAG.
Those requirements get you a working RAG solution, but for higher quality results, you need to include these missing requirements: understanding your content and adapting your content both proactively and reactively.
Understanding your content
You need to understand your content before you implement RAG, so that you can answer these questions:
Is your preprocessing script effective for your content?
For example, suppose you have conceptual graphics with alternative text that describes the graphics for accessibility requirements. If you implemented RAG with a preprocessing script that doesn’t include the alternative text, then you need to either update your script or add that information in your main text so that the LLM can access it.
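As a rough illustration of the first option, here's a minimal Python sketch of a preprocessing step that keeps alternative text. It assumes your source content is HTML and uses only the standard library. The names `TextWithAltExtractor` and `html_to_text`, the `[Image: ...]` label, and the sample markup are hypothetical; they aren't taken from our actual preprocessing script.

```python
from html.parser import HTMLParser


class TextWithAltExtractor(HTMLParser):
    """Collects visible text and also the alt text of images."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # A naive tag stripper would drop <img> entirely; keep its alt text.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt and alt.strip():
                self.parts.append(f"[Image: {alt.strip()}]")

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def html_to_text(html):
    """Return plain text for the LLM, one text fragment per line."""
    parser = TextWithAltExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Hypothetical sample content, for illustration only.
sample = (
    "<p>The pipeline has three stages.</p>"
    '<img src="flow.png" alt="Diagram: ingest, then index, then answer">'
)
print(html_to_text(sample))
```

A quick way to check your own script is to run it on a page with an important graphic and search the output for the alternative text. If it's missing, the LLM never sees it.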
Is your LLM the best fit for your content?
For example, suppose you have tables that indicate whether something is supported with a checkmark and not supported with an empty cell. Not all LLMs can correctly interpret checkmarks to mean “supported” and empty cells to mean “not supported”. If you implemented RAG with an LLM that doesn’t understand checkmarks and empty cells, then you need to either switch LLMs or update your tables with text that the LLM does understand.
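If you go the route of updating the tables, the change can be as simple as spelling out each cell. The following Python sketch shows the idea on a table that has already been parsed into rows. The function names, the set of checkmark characters, and the sample table are assumptions for illustration; adjust them to whatever your content actually uses.

```python
# Characters treated as "supported". This set is an assumption;
# extend it to match the symbols in your own tables.
CHECKMARKS = {"✓", "✔", "✅"}


def spell_out_support(cell):
    """Turn a checkmark or an empty cell into explicit text."""
    value = cell.strip()
    if value in CHECKMARKS:
        return "Supported"
    if value == "":
        return "Not supported"
    return value  # Leave any other cell content as it is.


def rewrite_table(rows, label_columns=1):
    """Rewrite the body cells of a table, keeping the header row and label columns."""
    header, *body = rows
    return [header] + [
        row[:label_columns] + [spell_out_support(c) for c in row[label_columns:]]
        for row in body
    ]


# Hypothetical sample table, for illustration only.
table = [
    ["Feature", "Plan A", "Plan B"],
    ["Export to PDF", "✓", ""],
    ["Single sign-on", "", "✓"],
]
for row in rewrite_table(table):
    print(" | ".join(row))
```

With explicit text in every cell, the answer no longer depends on whether a particular LLM can interpret a symbol or the absence of one.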
Adapting your content
You need to adapt your content both before and after you implement your RAG solution:
- Proactive plan: Update any content that can’t be preprocessed into a form that the LLM can understand.
- Reactive plan: Update your content in response to incorrect or inadequate answers from the LLM. You need to set up a process to review LLM answers and evaluate whether content should be updated to improve answers.
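To make the reactive plan a bit more concrete, here's a minimal Python sketch of one way to turn reviewed answers into a list of documents worth revisiting. It assumes that reviewers rate each answer and that you log which documents were retrieved for it. The `ReviewedAnswer` record, the rating labels, the `min_flags` threshold, and the sample data are all hypothetical; this is not a description of our review process.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ReviewedAnswer:
    question: str
    source_docs: list  # Documents retrieved for this answer.
    rating: str        # Assumed labels: "correct", "incorrect", "incomplete".


def docs_needing_review(reviews, min_flags=2):
    """Count how often each document sits behind a poorly rated answer."""
    flags = Counter()
    for review in reviews:
        if review.rating != "correct":
            # Count each document once per answer, even if retrieved twice.
            flags.update(set(review.source_docs))
    # Most-flagged documents first; ties are ordered by name.
    ranked = sorted(flags.items(), key=lambda item: (-item[1], item[0]))
    return [(doc, count) for doc, count in ranked if count >= min_flags]


# Hypothetical review log, for illustration only.
reviews = [
    ReviewedAnswer("Which plans support SSO?", ["plans.html"], "incorrect"),
    ReviewedAnswer("How do I export a report?", ["export.html"], "correct"),
    ReviewedAnswer(
        "Does Plan A include PDF export?",
        ["plans.html", "export.html"],
        "incomplete",
    ),
]
print(docs_needing_review(reviews))
```

A document that keeps showing up behind poor answers isn't necessarily wrong, but it's a good place to start looking for structure, format, or terminology problems.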
Stay tuned for more blog posts about adapting content for AI, where I’ll show specific, tested examples of how to improve RAG solutions.