
RAG (Retrieval-Augmented Generation)

Feed external, up-to-date business data into a foundation model's prompt at query time.

Academic Definition

Retrieval-Augmented Generation (RAG) is an architectural pattern that extends a Large Language Model (LLM) by injecting external documents into its context window before generation. Instead of relying solely on the static knowledge the model learned during pre-training, a RAG system runs a semantic search at query time to retrieve the text chunks most relevant to the user's query from a database or vector store (such as Pinecone, Chroma, or pgvector). It then inserts these chunks into a prompt template, enabling the model to answer with up-to-date, specialized business data without requiring expensive model fine-tuning.
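The retrieval step described above can be illustrated with a minimal sketch. It assumes a toy `embed` function (plain word counts) as a stand-in for a real embedding model, and an in-memory list instead of a vector store; the function names and sample chunks are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts.

    A real system would call an embedding model here instead.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical sample chunks, invented for this sketch.
chunks = [
    "Refunds for course cancellations are issued within 14 days.",
    "Our office is closed on public holidays.",
    "Certificates are emailed after the final exam.",
]
top = retrieve("refund policy for course cancellations", chunks, k=1)
print(top)
```

Word counts only match exact tokens, so this sketch misses synonyms and paraphrases that a learned embedding model would be expected to capture; the ranking logic is the part that carries over.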

Practical Application & Code Structure

Standard RAG Engineering Workflow:

  1. Ingestion & Chunking: Convert internal PDFs, Markdown files, or database records into raw text. Segment these documents into overlapping chunks (e.g., 512 tokens with a 10% overlap) so that context spanning a chunk boundary is preserved.
  2. Vectorization: Pass chunks through an embedding model (like OpenAI's text-embedding-3-small) to generate high-dimensional vectors.
  3. Storage: Index these vectors in a vector database.
  4. Query & Search: When a user inputs a query (e.g. "What is our company's refund policy on course cancellations?"), convert the query into a vector and perform a cosine-similarity search.
  5. Prompt Construction: Inject the top-scoring text chunks directly into the LLM system instructions:
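# retrieved_chunks: list of the top-scoring chunk strings from step 4
retrieved_chunks_text = "\n\n".join(retrieved_chunks)
system_instruction = f"""
Use the following context to answer the user query:
---
CONTEXT:
{retrieved_chunks_text}
---
If the answer is not in the context, respond: 'I am unable to answer.'
"""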
  6. Execution: The LLM generates an answer grounded in the injected context. Grounding reduces hallucinations but does not eliminate them, so answers should still be checked against the retrieved chunks.
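The chunking step in the workflow above (fixed-size windows with a fractional overlap) can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `chunk_tokens` is hypothetical, and whitespace-separated words stand in for tokens, whereas a real pipeline would use the tokenizer that matches its embedding model.

```python
def chunk_tokens(
    tokens: list[str], size: int = 512, overlap_ratio: float = 0.10
) -> list[list[str]]:
    """Split tokens into windows of `size` that overlap by `overlap_ratio`."""
    if size <= 0:
        raise ValueError("size must be positive")
    overlap = int(size * overlap_ratio)  # 512 * 0.10 -> 51 tokens
    step = size - overlap                # distance between window starts
    if step <= 0:
        raise ValueError("overlap_ratio must be less than 1")

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


# Assumption: words approximate tokens for the purpose of this demo.
document = "word " * 1200
tokens = document.split()
chunks = chunk_tokens(tokens)
print(len(chunks), [len(c) for c in chunks])
```

Each window starts `size - overlap` tokens after the previous one, so the last tokens of one chunk reappear at the start of the next. That repetition is what keeps a sentence that straddles a boundary retrievable from at least one chunk.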
