The numbers
From 250 teams shipping production RAG (2026 community survey):
- 73% use vector-only retrieval (cosine similarity), mostly with OpenAI embeddings.
- 12% use hybrid (BM25 + vector) with a reranker.
- 9% use hybrid without rerank.
- 6% use BM25 alone, for specific use cases (corpora heavy on identifiers and error codes).
Most teams reported that "vector cosine works fine", but only 18% had actually run a side-by-side comparison.
The catch
In the smaller cohort that did compare (n=46), hybrid + rerank produced a median +14 percentage-point improvement on Recall@5 vs vector-only. The improvement was largest on technical content (error codes, product names, version numbers).
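For readers who have not computed the metric before, here is a minimal sketch of a Recall@5 comparison between two retrieval setups. The query IDs, document IDs, and rankings are made-up toy data, and the helper names (`recall_at_k`, `mean_recall`) are illustrative rather than taken from the survey. The toy numbers say nothing about what you will see on a real corpus.

```python
from statistics import mean

def recall_at_k(retrieved, relevant, k=5):
    """Fraction of the relevant doc IDs that appear in the top-k results."""
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

def mean_recall(run, labels, k=5):
    """Average Recall@k over every labelled query (missing queries score 0)."""
    return mean(recall_at_k(run.get(q, []), rel, k) for q, rel in labels.items())

# Hypothetical toy data: query -> relevant doc IDs.
labels = {
    "q1": ["d1"],
    "q2": ["d4", "d7"],
}
# Hypothetical ranked output of each system for the same queries.
vector_only = {
    "q1": ["d9", "d8", "d3", "d2", "d6", "d1"],  # d1 is ranked 6th, so it misses at k=5
    "q2": ["d4", "d2", "d3", "d5", "d6"],        # finds d4 but not d7
}
hybrid_rerank = {
    "q1": ["d1", "d9", "d8", "d3", "d2"],
    "q2": ["d4", "d7", "d2", "d3", "d5"],
}

baseline = mean_recall(vector_only, labels)
candidate = mean_recall(hybrid_rerank, labels)
delta_pp = (candidate - baseline) * 100  # difference in percentage points

print(f"vector-only Recall@5:   {baseline:.2f}")
print(f"hybrid+rerank Recall@5: {candidate:.2f}")
print(f"difference: {delta_pp:+.1f} percentage points")
```

In practice the labelled set would be a few hundred real queries with known-good documents, and both systems would be run over the same index snapshot so the only variable is the retrieval method.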
If your corpus has structured identifiers, as almost all B2B SaaS docs do, you should run the comparison. The single biggest "free upgrade" in production RAG remains hybrid + rerank.
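The survey does not say how teams combined BM25 and vector results. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the method the surveyed teams used. The document IDs and the example query are hypothetical, and the reranking step is left out: the fused list would normally be passed to a reranker before the top results reach the model.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several ranked lists of doc IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is the commonly used default constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:top_n]

# Hypothetical ranked outputs for the query "error E1042 after upgrade to v2.3".
bm25_hits = ["doc_e1042", "doc_v23_notes", "doc_upgrade"]       # exact-token matches
vector_hits = ["doc_upgrade", "doc_e1042", "doc_troubleshoot"]  # semantic neighbours

candidates = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(candidates)
```

RRF only needs ranks, not raw scores, so it avoids having to normalise BM25 scores against cosine similarities. Documents that both retrievers rank highly, such as the page containing the literal error code, rise to the top of the fused list.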