Generation: Grounding and Citation
Prompt LLMs to stay grounded in retrieved context, provide source citations, handle incomplete information gracefully, and avoid hallucination in RAG responses.
Retrieval found the right documents. Now the LLM must use them correctly — grounding its answer in the provided context, citing sources, and honestly flagging gaps.
🔄 Quick Recall: In the previous lesson, you learned advanced retrieval strategies: hybrid search, reranking, query rewriting, and metadata filtering. Those techniques ensure the right documents reach the LLM. Now you’ll learn how to make the LLM use those documents faithfully.
The Grounding Problem
Even with perfect retrieval, the LLM can still:
- Ignore the context and answer from training data
- Misinterpret the context and draw wrong conclusions
- Blend context with training data, creating a mix of sourced and unsourced claims
- Over-generalize from specific document statements
The generation prompt is your primary defense against all four.
The Grounding Prompt
Minimal Grounding
<system>
Answer the user's question using ONLY the provided context.
If the context doesn't contain the answer, say "I don't have
information about that in our knowledge base."
Do NOT use your general knowledge to fill gaps.
</system>
<context>
{retrieved_chunks}
</context>
<question>{user_question}</question>
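In code, the template above amounts to assembling a system message and a context-wrapped user message. A minimal sketch (the function name and delimiter choices are illustrative, not from a specific library):

```python
def build_grounded_prompt(chunks: list[str], question: str) -> list[dict]:
    """Assemble the minimal grounding prompt from retrieved chunks.

    Mirrors the <system>/<context>/<question> structure shown above;
    the exact delimiters are an illustrative choice.
    """
    system = (
        "Answer the user's question using ONLY the provided context.\n"
        'If the context doesn\'t contain the answer, say "I don\'t have\n'
        'information about that in our knowledge base."\n'
        "Do NOT use your general knowledge to fill gaps."
    )
    context = "\n\n".join(chunks)
    user = f"<context>\n{context}\n</context>\n\n<question>{question}</question>"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list plugs directly into any chat-style completion API as the `messages` argument.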
Production Grounding
<system>
You are a knowledge base assistant. Your ONLY source of
information is the context provided below.
Rules:
1. Answer ONLY based on the provided context
2. Cite the source for every factual claim: [Source: filename, page]
3. If the context partially answers the question, answer what
you can and explicitly state what's missing
4. If the context doesn't address the question at all, respond:
"I don't have information about that in our knowledge base.
You might find this in [suggest where to look]."
5. Never start with "Based on the provided context" — just answer
6. If multiple sources conflict, note the discrepancy
</system>
The difference: minimal grounding prevents the worst hallucinations. Production grounding produces trustworthy, citable, transparent responses.
✅ Quick Check: Your grounding prompt says “Answer based on the provided context.” The LLM responds: “Based on the provided context, the return policy states…” This phrasing is technically correct but sounds robotic. How do you fix it? (Answer: Add rule 5 from the production grounding prompt: “Never start with ‘Based on the provided context’ — just answer naturally.” The response should read: “Our return policy allows returns within 30 days of purchase [Source: return-policy.pdf, p.3].” The citation proves grounding without the awkward preamble.)
Citation Patterns
Inline Citations
Our return policy allows returns within 30 days of purchase
[Source: return-policy.pdf, p.3]. Opened electronics have a
15% restocking fee [Source: electronics-returns.pdf, p.1].
Pros: Every claim is traceable, and users can verify it. Cons: Many citations can make the text harder to read.
Footnote Citations
Our return policy allows returns within 30 days of purchase¹.
Opened electronics have a 15% restocking fee².
Sources:
1. return-policy.pdf, page 3
2. electronics-returns.pdf, page 1
Pros: Cleaner reading experience. Cons: Harder for the LLM to maintain footnote accuracy.
Source Block
Our return policy allows returns within 30 days of purchase.
Opened electronics have a 15% restocking fee.
---
Sources used:
- return-policy.pdf (page 3)
- electronics-returns.pdf (page 1)
Pros: Simplest for the LLM to generate. Cons: Doesn’t specify which source supports which claim.
Recommendation: Inline citations for high-stakes use cases (legal, medical, financial). Source block for general Q&A.
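For the LLM to emit citations at all, the source metadata has to appear in the context it sees. A small sketch of one way to render chunks with their provenance (the `text`/`filename`/`page` field names are assumptions; use whatever your retriever returns):

```python
def format_context_with_sources(chunks: list[dict]) -> str:
    """Render retrieved chunks with source headers so the LLM can
    produce [Source: filename, page] citations.

    Each chunk dict is assumed to carry 'text', 'filename', and
    'page' keys (illustrative field names).
    """
    blocks = []
    for chunk in chunks:
        header = f"[Source: {chunk['filename']}, p.{chunk['page']}]"
        blocks.append(f"{header}\n{chunk['text']}")
    # Separate chunks visibly so the model doesn't blur their boundaries
    return "\n\n---\n\n".join(blocks)
```

Feed the result into the `{retrieved_chunks}` slot of the grounding prompt; because each chunk is prefixed with its own source tag, the model can copy the tag verbatim when citing.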
Handling Incomplete Information
The hardest generation challenge: what to do when the context doesn’t fully answer the question.
Pattern: Partial Answer with Transparency
User: "What are the health insurance deductibles for 2026?"
Context only contains 2025 deductible information.
Good response:
"Based on our records, the 2025 health insurance deductibles
were $1,500 for individual and $3,000 for family plans
[Source: benefits-2025.pdf, p.12].
I don't have 2026 deductible information in our knowledge base.
For the most current rates, check with HR or visit the benefits
portal at benefits.company.com."
Pattern: Conflicting Sources
User: "How many vacation days do new employees get?"
Context contains two conflicting chunks:
- Employee handbook v3: "15 days per year"
- HR policy update 2025: "20 days per year for new hires"
Good response:
"There's a discrepancy in our records. The employee handbook
(v3) states 15 vacation days [Source: handbook-v3.pdf, p.24],
while a 2025 HR policy update states 20 days for new hires
[Source: hr-update-2025.pdf, p.2].
The 2025 update is more recent and likely reflects current
policy, but I recommend confirming with HR."
✅ Quick Check: The context contains a document from 2023 stating “Our team has 50 employees.” The user asks “How large is our team?” Should the RAG system answer “50 employees”? (Answer: It should answer with a temporal caveat: “As of our 2023 records, the team had 50 employees [Source: team-overview-2023.pdf]. This may not reflect the current team size.” Without the caveat, the user might act on outdated information. Timestamps in metadata enable these caveats automatically.)
Anti-Hallucination Techniques
Technique 1: Temperature Control
Set a low temperature (0.0–0.2) for RAG generation. Low temperature makes the LLM more deterministic, so it is more likely to stick to the provided context rather than creatively extrapolate.
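A minimal sketch of wiring this in, assuming an OpenAI-style chat completions API (the model name is a placeholder):

```python
def build_generation_request(context: str, question: str,
                             model: str = "gpt-4o-mini") -> dict:
    """Keyword arguments for an OpenAI-style chat completion call.

    temperature=0.0 makes decoding near-deterministic, which keeps
    the model closer to the retrieved context.
    """
    return {
        "model": model,
        "temperature": 0.0,  # low temperature for grounded RAG answers
        "messages": [
            {"role": "system",
             "content": "Answer ONLY based on the provided context."},
            {"role": "user",
             "content": f"<context>\n{context}\n</context>\n\n"
                        f"<question>{question}</question>"},
        ],
    }

# Usage (assuming the openai package and a configured client):
# response = client.chat.completions.create(**build_generation_request(ctx, q))
```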
Technique 2: Constrained Vocabulary
In the prompt, instruct the LLM to use only terms that appear in the context:
Use the same terminology as the source documents.
Do not paraphrase technical terms or numbers.
Technique 3: Post-Generation Verification
After the LLM generates a response, verify each claim:
For each factual claim in the response:
→ Search for supporting text in the retrieved chunks
→ If found → Keep (verified)
→ If not found → Flag or remove (potential hallucination)
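The loop above can be approximated with a simple token-overlap heuristic: split the response into sentences and flag any sentence whose content words are not well covered by some retrieved chunk. This is a crude stand-in for entailment checking (production systems often use an NLI model instead); the threshold is an assumption to tune:

```python
import re

def verify_claims(response: str, chunks: list[str],
                  min_overlap: float = 0.6) -> list[tuple[str, bool]]:
    """Flag response sentences not supported by the retrieved chunks.

    A sentence counts as supported if at least `min_overlap` of its
    word tokens appear in a single chunk. Returns (sentence, supported)
    pairs; unsupported sentences are potential hallucinations.
    """
    chunk_vocab = [set(re.findall(r"[a-z0-9]+", c.lower())) for c in chunks]
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if not words:
            continue
        supported = any(
            len(words & vocab) / len(words) >= min_overlap
            for vocab in chunk_vocab
        )
        results.append((sentence, supported))
    return results
```

Flagged sentences can be removed, rewritten, or surfaced to the user with a warning, depending on how high-stakes the domain is.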
Practice Exercise
- Write a grounding prompt for a knowledge base in your domain
- Include: citation format, gap handling, conflict handling
- Test it mentally: If the context doesn’t answer the question, what does your prompt instruct the LLM to do?
Key Takeaways
- Grounding language must be restrictive (“ONLY based on”), not additive (“using”), to prevent training-data contamination
- Every factual claim should have a citation — missing citations are a hallucination red flag
- Handle incomplete information transparently: answer what you can, explicitly note gaps, suggest where to find more
- Handle conflicting sources by noting the discrepancy and recommending the more authoritative or recent source
- Low temperature, constrained vocabulary, and post-generation verification reduce hallucination in RAG
Up Next
In the next lesson, you’ll learn to measure and improve RAG quality — using RAGAS metrics, building test suites, and systematically optimizing each stage of the pipeline.