Simple tricks to improve Retrieval Augmented Generation (RAG) systems

Motivation

Retrieval Augmented Generation (RAG) is a framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information. RAG is increasingly popular in industry as it’s simple to implement yet powerful. Here I’ll share some tricks to improve RAG systems.

Use structured text over PDF text

RAG systems typically split the data into chunks of text, embed the chunks, and store them in a search index. If the data is in PDF format, we first need to convert the PDF to text, and that conversion can be noisy. If the data is available in a structured format, we can use a parser to get much cleaner text. For example, for docx files we can use pandoc to convert them to clean text.
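As a minimal sketch of the docx route, the snippet below shells out to pandoc from Python. It assumes pandoc is installed and on the PATH; the function names and file names are hypothetical, and the `--wrap=none` choice is my own suggestion rather than something the results below depend on.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(src: Path, dst: Path) -> list[str]:
    """Build a pandoc invocation that converts a .docx file to plain text."""
    return [
        "pandoc",
        str(src),
        "--from=docx",
        "--to=plain",
        "--wrap=none",  # assumption: avoid hard line wraps so chunking sees whole paragraphs
        "-o",
        str(dst),
    ]


def convert_docx_to_text(src: Path, dst: Path) -> str:
    """Run pandoc (must be installed) and return the extracted text."""
    subprocess.run(build_pandoc_command(src, dst), check=True)
    return dst.read_text(encoding="utf-8")
```

Calling `convert_docx_to_text(Path("report.docx"), Path("report.txt"))` would write the plain text next to the source file and return it, ready for chunking and embedding.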

Here are the quantitative results for RAG on (my) PDF documents:

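| Metric | Single Hop | Table Related | Multi Hop | Weighted Avg |
|---|---|---|---|---|
| Cosine Similarity | 0.7993 | 0.8121 | 0.8012 | 0.8036 |
| Rouge1 F1 | 0.3371 | 0.3356 | 0.3121 | 0.332 |
| Rouge1 Recall | 0.5847 | 0.5608 | 0.4423 | 0.5506 |
| Retrieval F1 | 0.6042 | 0.7 | 0.5667 | 0.6186 |
| Retrieval Recall | 0.8125 | 1 | 0.5556 | 0.7692 |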

Here are the results for RAG on documents converted with pandoc:

| Metric | Single Hop | Table Related | Multi Hop | Weighted Avg |
|---|---|---|---|---|
| Cosine Similarity | 0.7829 | 0.845 | 0.8228 | 0.8098 |
| Rouge1 F1 | 0.3449 | 0.5225 | 0.4619 | 0.4223 |
| Rouge1 Recall | 0.4984 | 0.7046 | 0.5372 | 0.5701 |
| Retrieval F1 | 0.6042 | 0.8000 | 0.5000 | 0.6170 |
| Retrieval Recall | 0.8125 | 1.0000 | 0.5000 | 0.7436 |

Prepending document title to your chunks

Every document has a title (e.g. its file name) and is split into chunks. One way to improve retrieval is to prepend the document title to each chunk, so that every chunk keeps some context about the document it came from.
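A minimal sketch of the idea, assuming a naive word-count chunker: the chunking strategy, the `chunk_size` default, the blank-line separator, and the function names are all illustrative choices, not the setup used for the results below.

```python
def chunk_words(text: str, chunk_size: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most `chunk_size` words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def chunks_with_title(title: str, text: str, chunk_size: int = 200) -> list[str]:
    """Chunk the text, then prepend the document title to every chunk."""
    return [f"{title}\n\n{chunk}" for chunk in chunk_words(text, chunk_size)]


# Hypothetical document, used only to show the shape of the output.
chunks = chunks_with_title(
    "leave_policy.docx",
    "Staff may carry forward up to ten days of leave",
    chunk_size=5,
)
```

The titled chunks are what get embedded and indexed, so a query that mentions the document's name or topic can match chunks that never repeat it in their own text.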

Here are the quantitative results without prepending:

| Metric | Single Hop | Table Related | Multi Hop | Weighted Avg |
|---|---|---|---|---|
| Cosine Similarity (Instructor-XL) | 0.8114 | 0.8559 | 0.8667 | 0.8371 |
| Rouge1 F1 | 0.3292 | 0.4859 | 0.4266 | 0.4013 |
| Rouge1 Recall | 0.5829 | 0.6335 | 0.6721 | 0.617 |
| Retrieval F1 | 0.7111 | 0.8182 | 0.4167 | 0.6364 |
| Retrieval Recall | 0.8667 | 0.9091 | 0.4167 | 0.7179 |

Here are the results with prepending:

| Metric | Single Hop | Table Related | Multi Hop | Weighted Avg |
|---|---|---|---|---|
| Cosine Similarity (Instructor-XL) | 0.8381 | 0.8701 | 0.8135 | 0.8445 |
| Rouge1 F1 | 0.3967 | 0.4848 | 0.3151 | 0.4117 |
| Rouge1 Recall | 0.6671 | 0.6806 | 0.5623 | 0.6521 |
| Retrieval F1 | 0.8000 | 0.9091 | 0.5000 | 0.7209 |
| Retrieval Recall | 0.9333 | 1.0000 | 0.5000 | 0.7947 |

Credits to Shin Youn for this tip.

Converting flat tables to JSON format

For long flat tables (i.e. m × n tables with only one value per cell), the entries near the bottom of the table may be too far away from the headers, so a chunk holding those rows can lose the header context. One simple trick is to convert these flat tables to JSON format, such that each key is a table header and each value is the corresponding cell value.
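One way to sketch the conversion, assuming the table has already been parsed into a header list and a list of rows: each row becomes one JSON object, so the headers travel with every row. The function name and the sample table are hypothetical.

```python
import json


def table_to_json_records(headers: list[str], rows: list[list[str]]) -> list[str]:
    """Serialize each row of a flat table as a JSON object keyed by header."""
    for row in rows:
        if len(row) != len(headers):
            raise ValueError("every row must have exactly one cell per header")
    # ensure_ascii=False keeps non-ASCII cell text readable in the output.
    return [
        json.dumps(dict(zip(headers, row)), ensure_ascii=False)
        for row in rows
    ]


# Hypothetical table, used only to show the shape of the output.
records = table_to_json_records(
    ["Name", "Role"],
    [["Ada", "Engineer"], ["Lin", "Analyst"]],
)
# records[0] is '{"Name": "Ada", "Role": "Engineer"}'
```

Because every record is self-describing, a chunk boundary that falls in the middle of a long table no longer separates values from their column names.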

Credits to Jun How and Qian Hui for this tip.