Generation: Grounding and Citation
Prompt LLMs to stay grounded in retrieved context, provide source citations, handle incomplete information gracefully, and avoid hallucination in RAG responses.
Retrieval found the right documents. Now the LLM must use them correctly — grounding its answer in the provided context, citing sources, and honestly flagging gaps.
🔄 Quick Recall: In the previous lesson, you learned advanced retrieval strategies: hybrid search, reranking, query rewriting, and metadata filtering. Those techniques ensure the right documents reach the LLM. Now you’ll learn how to make the LLM use those documents faithfully.
The Grounding Problem
Even with perfect retrieval, the LLM can still:
- Ignore the context and answer from training data
- Misinterpret the context and draw wrong conclusions
- Blend context with training data, creating a mix of sourced and unsourced claims
- Over-generalize from specific document statements
The generation prompt is your primary defense against all four.
The Grounding Prompt
Minimal Grounding
<system>
Answer the user's question using ONLY the provided context.
If the context doesn't contain the answer, say "I don't have
information about that in our knowledge base."
Do NOT use your general knowledge to fill gaps.
</system>
<context>
{retrieved_chunks}
</context>
<question>{user_question}</question>
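In code, the template above amounts to assembling a system message and a context-wrapped user message. A minimal sketch (the function name and delimiter choices are illustrative, not from a specific library):

```python
def build_grounded_prompt(chunks: list[str], question: str) -> list[dict]:
    """Assemble the minimal grounding prompt from retrieved chunks.

    Mirrors the <system>/<context>/<question> structure shown above;
    the exact delimiters are an illustrative choice.
    """
    system = (
        "Answer the user's question using ONLY the provided context.\n"
        'If the context doesn\'t contain the answer, say "I don\'t have\n'
        'information about that in our knowledge base."\n'
        "Do NOT use your general knowledge to fill gaps."
    )
    context = "\n\n".join(chunks)
    user = f"<context>\n{context}\n</context>\n\n<question>{question}</question>"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list plugs directly into any chat-style completion API as the `messages` argument.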
Production Grounding
<system>
You are a knowledge base assistant. Your ONLY source of
information is the context provided below.
Rules:
1. Answer ONLY based on the provided context
2. Cite the source for every factual claim: [Source: filename, page]
3. If the context partially answers the question, answer what
you can and explicitly state what's missing
4. If the context doesn't address the question at all, respond:
"I don't have information about that in our knowledge base.
You might find this in [suggest where to look]."
5. Never start with "Based on the provided context" — just answer
6. If multiple sources conflict, note the discrepancy
</system>
The difference: minimal grounding prevents the worst hallucinations. Production grounding produces trustworthy, citable, transparent responses.
✅ Quick Check: Your grounding prompt says “Answer based on the provided context.” The LLM responds: “Based on the provided context, the return policy states…” This phrasing is technically correct but sounds robotic. How do you fix it? (Answer: Add rule 5 from the production grounding prompt: “Never start with ‘Based on the provided context’ — just answer naturally.” The response should read: “Our return policy allows returns within 30 days of purchase [Source: return-policy.pdf, p.3].” The citation proves grounding without the awkward preamble.)
Citation Patterns
Inline Citations
Our return policy allows returns within 30 days of purchase
[Source: return-policy.pdf, p.3]. Opened electronics have a
15% restocking fee [Source: electronics-returns.pdf, p.1].
Pros: Every claim is traceable, and users can verify it. Cons: Many citations can make the text harder to read.
Footnote Citations
Our return policy allows returns within 30 days of purchase¹.
Opened electronics have a 15% restocking fee².
Sources:
1. return-policy.pdf, page 3
2. electronics-returns.pdf, page 1
Pros: Cleaner reading experience. Cons: Harder for the LLM to maintain footnote accuracy.
Source Block
Our return policy allows returns within 30 days of purchase.
Opened electronics have a 15% restocking fee.
---
Sources used:
- return-policy.pdf (page 3)
- electronics-returns.pdf (page 1)
Pros: Simplest for the LLM to generate. Cons: Doesn’t specify which source supports which claim.
Recommendation: Inline citations for high-stakes use cases (legal, medical, financial). Source block for general Q&A.
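For the LLM to emit citations at all, the source metadata has to appear in the context it sees. A small sketch of one way to render chunks with their provenance (the `text`/`filename`/`page` field names are assumptions; use whatever your retriever returns):

```python
def format_context_with_sources(chunks: list[dict]) -> str:
    """Render retrieved chunks with source headers so the LLM can
    produce [Source: filename, page] citations.

    Each chunk dict is assumed to carry 'text', 'filename', and
    'page' keys (illustrative field names).
    """
    blocks = []
    for chunk in chunks:
        header = f"[Source: {chunk['filename']}, p.{chunk['page']}]"
        blocks.append(f"{header}\n{chunk['text']}")
    # Separate chunks visibly so the model doesn't blur their boundaries
    return "\n\n---\n\n".join(blocks)
```

Feed the result into the `{retrieved_chunks}` slot of the grounding prompt; because each chunk is prefixed with its own source tag, the model can copy the tag verbatim when citing.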
Handling Incomplete Information
The hardest generation challenge: what to do when the context doesn’t fully answer the question.
Pattern: Partial Answer with Transparency
User: "What are the health insurance deductibles for 2026?"
Context only contains 2025 deductible information.
Good response:
"Based on our records, the 2025 health insurance deductibles
were $1,500 for individual and $3,000 for family plans
[Source: benefits-2025.pdf, p.12].
I don't have 2026 deductible information in our knowledge base.
For the most current rates, check with HR or visit the benefits
portal at benefits.company.com."
Pattern: Conflicting Sources
User: "How many vacation days do new employees get?"
Context contains two conflicting chunks:
- Employee handbook v3: "15 days per year"
- HR policy update 2025: "20 days per year for new hires"
Good response:
"There's a discrepancy in our records. The employee handbook
(v3) states 15 vacation days [Source: handbook-v3.pdf, p.24],
while a 2025 HR policy update states 20 days for new hires
[Source: hr-update-2025.pdf, p.2].
The 2025 update is more recent and likely reflects current
policy, but I recommend confirming with HR."
✅ Quick Check: The context contains a document from 2023 stating “Our team has 50 employees.” The user asks “How large is our team?” Should the RAG system answer “50 employees”? (Answer: It should answer with a temporal caveat: “As of our 2023 records, the team had 50 employees [Source: team-overview-2023.pdf]. This may not reflect the current team size.” Without the caveat, the user might act on outdated information. Timestamps in metadata enable these caveats automatically.)
Anti-Hallucination Techniques
Technique 1: Temperature Control
Set a low temperature (0.0–0.2) for RAG generation. Low temperature makes the LLM more deterministic, so it is more likely to stick to the provided context rather than creatively extrapolate.
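A minimal sketch of wiring this in, assuming an OpenAI-style chat completions API (the model name is a placeholder):

```python
def build_generation_request(context: str, question: str,
                             model: str = "gpt-4o-mini") -> dict:
    """Keyword arguments for an OpenAI-style chat completion call.

    temperature=0.0 makes decoding near-deterministic, which keeps
    the model closer to the retrieved context.
    """
    return {
        "model": model,
        "temperature": 0.0,  # low temperature for grounded RAG answers
        "messages": [
            {"role": "system",
             "content": "Answer ONLY based on the provided context."},
            {"role": "user",
             "content": f"<context>\n{context}\n</context>\n\n"
                        f"<question>{question}</question>"},
        ],
    }

# Usage (assuming the openai package and a configured client):
# response = client.chat.completions.create(**build_generation_request(ctx, q))
```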
Technique 2: Constrained Vocabulary
In the prompt, instruct the LLM to use only terms that appear in the context:
Use the same terminology as the source documents.
Do not paraphrase technical terms or numbers.
Technique 3: Post-Generation Verification
After the LLM generates a response, verify each claim:
For each factual claim in the response:
→ Search for supporting text in the retrieved chunks
→ If found → Keep (verified)
→ If not found → Flag or remove (potential hallucination)
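The loop above can be approximated with a simple token-overlap heuristic: split the response into sentences and flag any sentence whose content words are not well covered by some retrieved chunk. This is a crude stand-in for entailment checking (production systems often use an NLI model instead); the threshold is an assumption to tune:

```python
import re

def verify_claims(response: str, chunks: list[str],
                  min_overlap: float = 0.6) -> list[tuple[str, bool]]:
    """Flag response sentences not supported by the retrieved chunks.

    A sentence counts as supported if at least `min_overlap` of its
    word tokens appear in a single chunk. Returns (sentence, supported)
    pairs; unsupported sentences are potential hallucinations.
    """
    chunk_vocab = [set(re.findall(r"[a-z0-9]+", c.lower())) for c in chunks]
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if not words:
            continue
        supported = any(
            len(words & vocab) / len(words) >= min_overlap
            for vocab in chunk_vocab
        )
        results.append((sentence, supported))
    return results
```

Flagged sentences can be removed, rewritten, or surfaced to the user with a warning, depending on how high-stakes the domain is.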
Practice Exercise
- Write a grounding prompt for a knowledge base in your domain
- Include: citation format, gap handling, conflict handling
- Test it mentally: If the context doesn’t answer the question, what does your prompt instruct the LLM to do?
Key Takeaways
- Grounding language must be restrictive (“ONLY based on”), not additive (“using”), to prevent training-data contamination
- Every factual claim should have a citation — missing citations are a hallucination red flag
- Handle incomplete information transparently: answer what you can, explicitly note gaps, suggest where to find more
- Handle conflicting sources by noting the discrepancy and recommending the more authoritative or recent source
- Low temperature, constrained vocabulary, and post-generation verification reduce hallucination in RAG
Up Next
In the next lesson, you’ll learn to measure and improve RAG quality — using RAGAS metrics, building test suites, and systematically optimizing each stage of the pipeline.