What is RAG (Retrieval Augmented Generation)?

Large Language Models (LLMs) are powerful tools for natural language processing, capable of generating fluent and coherent text for various tasks. However, LLMs also have some limitations, such as their knowledge being stale, incomplete, or inaccurate. To overcome these challenges, we can use a technique called Retrieval Augmented Generation (RAG), which allows us to provide LLMs with relevant and up-to-date information from external sources.

In this blog post, I will explain what RAG is, why it is useful, and how to build it using a vector database and a knowledge graph, a leading combination for RAG. I will also give some examples of use cases that benefit from RAG and show how it can improve the quality and accuracy of the generated text.

RAG flow (source: https://gpt-index.readthedocs.io/en/latest/getting_started/concepts.html)

What is RAG?

RAG is a process for retrieving information relevant to a task, providing it to the language model along with a prompt, and relying on the model to use this specific information when responding. For example, if we want to generate a summary of a news article, we can use RAG to retrieve related articles or facts from a database and feed them to the LLM as additional context. The LLM can then use this information to generate a more accurate and informative summary.
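As a minimal sketch of this idea, here is how retrieved snippets might be stitched into a prompt before it is sent to the model (the question and documents are made up for illustration):

```python
def build_rag_prompt(question, retrieved_docs):
    """Assemble a prompt that grounds the model in retrieved context."""
    # Number each snippet so the model (and a reader) can cite it.
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_rag_prompt(
    "When was the product launched?",
    [
        "The product launched in March 2024.",
        "It targets enterprise users.",
    ],
)
print(prompt)
```

The exact prompt wording varies by application; the essential pattern is simply retrieved text plus the user's question, handed to the LLM as one input.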

RAG is different from fine-tuning, which involves training the LLM on new data to adapt it to a specific domain or task. Fine-tuning can be time-consuming and expensive, and in many scenarios it offers no significant advantage. RAG, on the other hand, allows us to use the same LLM as a general reasoning and text engine, while providing it with the necessary data in real time. This way, we can achieve customized solutions while maintaining data relevance and optimizing costs.

What should a RAG provide?

To implement RAG, we need two components: an LLM and a data source. The LLM can be any pretrained model that supports text generation, such as GPT-3 or T5. The data source can be any collection of documents or facts that are relevant to our task or domain. However, not all data sources are equally suitable for RAG. Ideally, we want a data source that is:

– Up-to-date: The data should reflect the latest information available on the topic of interest.

– Comprehensive: The data should cover all the aspects and details that are relevant to the task or domain.

– Accurate: The data should be reliable and trustworthy, free of errors or biases.

– Efficient: The data should be easy to access and query, with low latency and high throughput.


How to build RAG using Vector Database and Knowledge Graph?

One of the leading options for building such a data source is using a combination of Vector Database and Knowledge Graph. A vector database is a database that stores data as vectors, which are numerical representations of objects or concepts. A knowledge graph is a graph that stores data as nodes and edges, which represent entities and their relationships. By combining these two technologies, we can create a powerful data source that meets all the criteria above.

A vector database allows us to store and retrieve data based on similarity or relevance. For example, if we want to find documents related to a given query, we can use a vector database to compare the query vector with the document vectors and return the most similar ones. A vector database also enables fast and scalable queries, as it can leverage efficient indexing and search algorithms.
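As a rough illustration of similarity search, here is a minimal cosine-similarity lookup over hand-made three-dimensional vectors. Real systems use learned embeddings with hundreds of dimensions and approximate-nearest-neighbor indexes rather than a brute-force scan; the document names and vectors below are invented:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Brute-force scan: score every document and keep the k most similar."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

docs = {
    "vaccines": [0.9, 0.1, 0.0],
    "weather":  [0.0, 0.2, 0.9],
    "virology": [0.8, 0.3, 0.1],
}
print(top_k([1.0, 0.2, 0.0], docs, k=2))  # → ['vaccines', 'virology']
```

A production vector database replaces the brute-force loop with an index (e.g., HNSW) so that search stays fast as the collection grows to millions of vectors.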

A knowledge graph allows us to store and retrieve data based on semantics or meaning. For example, if we want to find facts that are related to a given entity, we can use a knowledge graph to traverse the graph from the entity node and return the connected nodes and edges. A knowledge graph also enables rich and structured queries, as it can leverage logical inference and reasoning.
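This kind of traversal can be sketched with a plain adjacency list. A real knowledge graph database such as FalkorDB would instead be queried with a graph query language, but the hop-by-hop expansion is the same; the entities and relations below are made up:

```python
# Toy knowledge graph as an adjacency list: node -> [(relation, neighbor)]
graph = {
    "COVID-19":   [("treated_by", "Vaccine"), ("caused_by", "SARS-CoV-2")],
    "Vaccine":    [("developed_by", "BioNTech")],
    "SARS-CoV-2": [],
    "BioNTech":   [],
}

def neighbors(entity, depth=1):
    """Return (subject, relation, object) facts reachable within `depth` hops."""
    facts, frontier = [], [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for rel, nbr in graph.get(node, []):
                facts.append((node, rel, nbr))
                next_frontier.append(nbr)
        frontier = next_frontier
    return facts

print(neighbors("COVID-19", depth=2))
```

Each returned triple is a self-contained fact that can be serialized into the LLM's context, which is exactly the structured payload a RAG pipeline wants from the graph side.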

By combining a vector database and a knowledge graph, we can create a data source that can answer both similarity-based and semantics-based queries. For example, if we want to find information about COVID-19 vaccines, we can use a vector database to find documents that are similar to our query, and then use a knowledge graph to extract facts from those documents. This way, we can obtain both relevant and informative data for our task.

To build RAG using these data sources, we need to follow these steps:

1. Preprocess the data: We need to transform our raw data (e.g., text documents) into vectors and graphs. We can use various methods for this step, such as word embeddings, sentence embeddings, document embeddings, entity extraction, relation extraction, etc.

2. Store the data: We need to store our vectors and graphs in a vector database and a knowledge graph respectively. 

3. Query the data: We need to query our data source based on our task or prompt. We can use various methods for this step, such as natural language queries, keyword queries, vector queries, graph queries, etc.

4. Generate the text: We need to provide the LLM with the query and the retrieved data as context, and ask it to generate a response. We can use various methods for this step, such as prompt engineering, few-shot learning, zero-shot learning, etc.
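The four steps above can be strung together end to end. The sketch below uses toy stand-ins for each component: keyword overlap in place of vector search, and a stub function in place of a real LLM call; the corpus sentences are invented:

```python
def retrieve(query, corpus, k=1):
    """Naive keyword-overlap retrieval standing in for a vector-database query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(prompt):
    """Stub standing in for a call to an actual LLM."""
    return f"(model output conditioned on {len(prompt)} chars of prompt)"

corpus = [
    "FalkorDB stores graphs and supports vector search.",
    "The weather today is sunny.",
]
question = "How does FalkorDB store data?"

context = retrieve(question, corpus)                      # steps 1-3: find relevant data
prompt = f"Context: {context[0]}\nQuestion: {question}"   # step 4: build the prompt
answer = generate(prompt)                                 # step 4: generate the response
```

Swapping the stubs for a real embedding model, a vector database, a knowledge graph, and an LLM API turns this skeleton into a working RAG pipeline without changing its shape.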

Examples of use cases that need RAG

RAG can be useful for many use cases that involve text generation, especially when the LLM’s knowledge is insufficient or outdated. Here are some examples of such use cases:

– Summarization: RAG can help generate summaries of long or complex texts, such as news articles, research papers, books, etc. By using RAG, we can provide the LLM with additional information from related sources, such as other articles, facts, opinions, etc. This can help the LLM generate more accurate and informative summaries that capture the main points and perspectives of the text.

– Question answering: RAG can help generate answers to factual or open-ended questions, such as trivia questions, homework questions, customer queries, etc. By using RAG, we can provide the LLM with relevant information from authoritative sources, such as Wikipedia, databases, experts, etc. This can help the LLM generate more precise and reliable answers that address the question and provide evidence or explanation.

– Content creation: RAG can help generate creative or original content, such as stories, poems, songs, jokes, etc. By using RAG, we can provide the LLM with inspiring information from diverse sources, such as literature, art, music, culture, etc. This can help the LLM generate more novel and interesting content that reflects the style and theme of the task.

Conclusion

In this blog post, I have explained what RAG is, why it is useful, and how to build it using a vector database and a knowledge graph as a leading option for RAG. I have also given some examples of use cases that benefit from RAG and shown how it can improve the quality and accuracy of the generated text.

RAG is a powerful technique that allows us to leverage the general capabilities of LLMs while incorporating specific information relevant to our tasks or domains. By using RAG, we can achieve customized solutions while maintaining data relevance and optimizing costs.

If you are interested in learning more about RAG or trying it out yourself, you can check out Building a Q&A System and Building and Querying a Knowledge Graph.