Lesson 6 · 12 min

Dynamic context assembly

Build a context assembler that retrieves, scores, and deduplicates content at request time, so every prompt carries the material most relevant to that specific query.

Context is not static

A common mistake in production RAG systems is treating retrieval as a binary include/exclude decision: "Retrieve 5 chunks → include them all → done."

Real-world context assembly is more nuanced:

  • Relevance varies by query: chunk A is highly relevant to question 1 and irrelevant to question 2
  • Redundancy wastes tokens: chunk A and chunk B may overlap by 80% and say the same thing
  • Recency matters: on the same topic, a 2025 document usually beats a 2022 document
  • Source diversity matters: three chunks from the same source page aren't three independent facts
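
The four factors above can be combined in a small assembler. What follows is a minimal sketch, not a reference implementation: the `Chunk` fields, the `assemble` function, the word-level Jaccard overlap, the one-year recency half-life, the 0.3 recency weight, and the one-chunk-per-source cap are all illustrative assumptions. The `relevance` value is assumed to come from whatever retriever you already use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Chunk:
    # All field names are hypothetical, chosen for this sketch.
    text: str
    source: str       # page or document the chunk came from
    published: date
    relevance: float  # query-specific similarity in [0, 1], supplied by the retriever


def jaccard(a: str, b: str) -> float:
    """Word-level overlap between two texts, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def assemble(chunks, today, max_chunks=3, max_per_source=1,
             overlap_threshold=0.8, half_life_days=365.0, recency_weight=0.3):
    """Greedily pick chunks by blended score, skipping near-duplicates
    and chunks from sources that have already hit their cap."""

    def score(c):
        # Recency decays by half every `half_life_days` (an assumed curve).
        age_days = max((today - c.published).days, 0)
        recency = 0.5 ** (age_days / half_life_days)
        return (1 - recency_weight) * c.relevance + recency_weight * recency

    selected = []
    per_source = {}
    for c in sorted(chunks, key=score, reverse=True):
        if len(selected) == max_chunks:
            break
        if per_source.get(c.source, 0) >= max_per_source:
            continue  # source diversity: this source is already represented
        if any(jaccard(c.text, s.text) >= overlap_threshold for s in selected):
            continue  # redundancy: too similar to a chunk we already kept
        selected.append(c)
        per_source[c.source] = per_source.get(c.source, 0) + 1
    return selected


# Made-up example data for a query about API rate limits.
chunks = [
    Chunk("the api rate limit is 100 requests per minute",
          "docs/limits", date(2025, 1, 1), 0.90),
    Chunk("the api rate limit is 100 requests per minute",
          "blog/post", date(2025, 1, 1), 0.85),
    Chunk("rate limits were 50 requests per minute in older versions",
          "archive", date(2022, 1, 1), 0.90),
    Chunk("authentication uses bearer tokens",
          "docs/limits", date(2025, 1, 1), 0.80),
    Chunk("retry with exponential backoff after a 429 response",
          "docs/errors", date(2025, 3, 1), 0.60),
]
result = assemble(chunks, today=date(2025, 6, 1))
```

With this data, the `blog/post` chunk is dropped as a duplicate of the `docs/limits` chunk, the second `docs/limits` chunk is dropped by the per-source cap, and the 2022 `archive` chunk ranks last among the survivors despite matching the top relevance score. A greedy pass like this is easy to reason about, but the weights and thresholds would need tuning against real queries.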