What is RAG | Retrieval-Augmented Generation?

Emad Dehnavi
3 min read · Aug 22, 2024

RAG, or Retrieval-Augmented Generation, is a technique for improving the accuracy and reliability of generative AI models by grounding their output in facts fetched from external sources. RAG models combine the strengths of pretrained dense retrieval (DPR) and sequence-to-sequence models.
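To make the retrieve-then-generate idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the tiny `DOCUMENTS` list, the helper names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the prompt wording are all hypothetical, and a toy bag-of-words similarity stands in for a real dense retriever such as DPR. The generation step is left out; the sketch stops at the augmented prompt that would be handed to a sequence-to-sequence model or LLM.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index many documents.
DOCUMENTS = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Python is a programming language created by Guido van Rossum.",
    "Court clerks look up precedents in a law library for judges.",
]


def embed(text):
    """Toy bag-of-words vector (an assumed stand-in for a dense encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


query = "When was the Eiffel Tower completed?"
passages = retrieve(query, DOCUMENTS, k=1)
prompt = build_prompt(query, passages)
print(prompt)  # this augmented prompt is what the generator model would receive
```

The point of the design is that the generator never has to "remember" the fact: the retriever looks it up at query time, so updating the knowledge base changes the answers without retraining the model.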

Large language models can be inconsistent. Sometimes their answer is spot on, and other times you read it and think, “What? That does not make any sense!” RAG is one way to improve the quality of LLM-generated responses.

Rick Merritt describes RAG as the court clerk of AI:

To understand the latest advance in generative AI, imagine a courtroom. Judges hear and decide cases based on their general understanding of the law. Sometimes a case — like a malpractice suit or a labor dispute — requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite. Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research. The court clerk of…
