Lesson 5 · 10 min
Chunking strategies that survive production
Chunking is the highest-leverage knob in RAG. This lesson covers three strategies that consistently work and the failure modes to avoid.
The three strategies that work
- Recursive character chunking with overlap. Split by paragraph, then by sentence, then by character if needed. Target 200-400 tokens with 10-20% overlap. Default for most prose.
- Semantic chunking. Use the embedding model itself: embed each sentence and start a new chunk where the similarity between adjacent sentences drops sharply. Better for technical docs with shifting topics; slower to build, because every sentence has to be embedded at indexing time.
- Structural chunking. For markdown, code, or any structured content: split on natural boundaries (headers, function definitions, sections). Preserve those headers in metadata.
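To make the structural strategy from the list above concrete, here is a minimal sketch for markdown. It assumes ATX-style headers (`#` through `######`) and does not handle `#` lines inside fenced code blocks. The names `Chunk` and `chunk_markdown` are hypothetical, chosen for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumption: ATX-style markdown headers, e.g. "## Install".
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    text: str
    # Header path from outermost to innermost, kept as metadata.
    headers: list[str] = field(default_factory=list)


def chunk_markdown(doc: str) -> list[Chunk]:
    """Split a markdown document at headers; attach the header path to each chunk."""
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # (level, title) of the enclosing headers
    body: list[str] = []              # lines collected since the last header

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk(text=text, headers=[title for _, title in path]))
        body.clear()

    for line in doc.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()  # close the section that ends at this header
            level = len(match.group(1))
            # Leave any section at the same or a deeper level.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Each chunk carries its full header path, so a retriever can show or filter on something like `["Guide", "Install"]` even though the header text is not in the chunk body. Sections that exceed the target size would still need a second pass with a size-based splitter.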
None of the three is universally best. Pick based on content type.
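For prose, the recursive strategy can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: whitespace-separated words stand in for model tokens, the last-resort split is by word rather than by character, and the overlap is taken as the trailing words of the previous chunk, so it can start mid-sentence. The names `count_tokens`, `split_recursive`, and `chunk_text` are hypothetical.

```python
import re


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate model tokens.
    return len(text.split())


def split_recursive(text: str, limit: int, level: int = 0) -> list[str]:
    """Break text into pieces of at most `limit` tokens, coarsest separator first."""
    text = text.strip()
    if not text:
        return []
    if count_tokens(text) <= limit:
        return [text]
    if level == 0:
        parts = re.split(r"\n\s*\n", text)        # paragraphs
    elif level == 1:
        parts = re.split(r"(?<=[.!?])\s+", text)  # sentences
    else:
        # Last resort: fixed-size word windows (stand-in for a character split).
        words = text.split()
        return [" ".join(words[i:i + limit]) for i in range(0, len(words), limit)]
    pieces: list[str] = []
    for part in parts:
        pieces.extend(split_recursive(part, limit, level + 1))
    return pieces


def chunk_text(text: str, max_tokens: int = 300, overlap: float = 0.15) -> list[str]:
    """Pack pieces into chunks of at most max_tokens, including the overlap prefix."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    budget = round(max_tokens * overlap)  # tokens reserved for the overlap prefix
    body_limit = max_tokens - budget

    # Greedily pack pieces into bodies of at most body_limit tokens.
    bodies: list[str] = []
    current: list[str] = []
    size = 0
    for piece in split_recursive(text, body_limit):
        n = count_tokens(piece)
        if current and size + n > body_limit:
            bodies.append(" ".join(current))
            current, size = [], 0
        current.append(piece)
        size += n
    if current:
        bodies.append(" ".join(current))

    # Prefix every chunk after the first with the tail of the previous body.
    chunks: list[str] = []
    for i, body in enumerate(bodies):
        if i > 0 and budget > 0:
            tail = " ".join(bodies[i - 1].split()[-budget:])
            body = tail + " " + body
        chunks.append(body)
    return chunks
```

Reserving the overlap budget up front keeps every chunk within `max_tokens`, at the cost of slightly smaller chunk bodies. In a real pipeline, `count_tokens` would be swapped for the tokenizer of the embedding model in use.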