What changed
The big one: a 1M-token context window (up from 200k on Sonnet 4.6 and 500k on Opus 4.6). For most teams, that's an entire monorepo, the full corpus of internal docs, or a full day of support transcripts in a single window.
What it means in practice
- RAG calculus shifts. If your corpus fits in 1M tokens (and for most teams under 50 engineers, it probably does), the RAG-vs-stuff-the-context decision now leans harder toward "just stuff it" combined with prompt caching.
- Cost discipline matters more. A 1M-token call is expensive. The cached-prefix pattern (stable system prompt and corpus first, variable user input last) becomes essential rather than optional.
- Latency caveat. Time-to-first-token on 1M-token contexts is substantial; plan for it in user-facing flows (streaming, progress indicators, or async handoffs).
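Whether a corpus actually fits can be sanity-checked before committing to the stuff-the-context approach. A minimal sketch using the rough 4-characters-per-token heuristic (the real ratio varies by tokenizer, language, and content, so treat the result as an estimate; file extensions and the 80% headroom are assumptions):

```python
import os

CHARS_PER_TOKEN = 4        # rough heuristic; real tokenizers vary
CONTEXT_BUDGET = 1_000_000  # 1M-token window

def estimate_corpus_tokens(root: str, exts=(".md", ".py", ".txt")) -> int:
    """Walk a directory tree and estimate total tokens for matching files."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue  # skip unreadable files
    return total_chars // CHARS_PER_TOKEN

def fits_in_context(root: str, headroom: float = 0.8) -> bool:
    """Leave headroom for the system prompt, conversation turns, and output."""
    return estimate_corpus_tokens(root) <= CONTEXT_BUDGET * headroom
```

If the estimate lands anywhere near the budget, verify with the provider's real token-counting endpoint before deciding.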
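The cached-prefix ordering can be enforced with a request builder that always puts the stable blocks first. A sketch assuming Anthropic-style `cache_control` breakpoints on the Messages API (check field names against the current prompt-caching docs; the model id is whatever you deploy):

```python
def build_request(system_prompt: str, corpus: str, user_input: str, model: str) -> dict:
    """Order the prompt so the stable prefix (system + corpus) is byte-identical
    across calls and ends with a cache breakpoint; only the user turn varies."""
    return {
        "model": model,
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": system_prompt},
            {
                "type": "text",
                "text": corpus,
                # cache everything up to and including this block
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [
            {"role": "user", "content": user_input},  # variable part last
        ],
    }
```

Anything upstream of the breakpoint must be byte-identical to hit the cache, so keep the corpus serialization deterministic: stable file ordering, no embedded timestamps.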
What to watch
Whether retrieval quality (does the model actually use what's deep in the context?) holds up at 1M scale. Anthropic's needle-in-a-haystack eval looks good; production traffic with messier prompts will tell us more.
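That eval is also easy to approximate on your own, messier documents: bury a known fact at controlled depths in the filler and check whether answers surface it. A minimal prompt builder for such a harness (the model call and scoring are omitted; all names here are illustrative):

```python
def build_haystack_prompt(filler_docs: list[str], needle: str, depth: float) -> str:
    """Insert `needle` at a fractional depth (0.0 = start, 1.0 = end) of the
    concatenated filler, so retrieval can be measured as a function of position."""
    haystack = "\n\n".join(filler_docs)
    pos = int(len(haystack) * depth)
    # snap back to a paragraph boundary so the needle isn't spliced mid-sentence
    boundary = haystack.rfind("\n\n", 0, pos)
    pos = 0 if boundary == -1 else boundary
    return haystack[:pos] + "\n\n" + needle + "\n\n" + haystack[pos:]
```

Sweeping `depth` over, say, 0.0 to 1.0 in steps of 0.1 at a few context sizes gives a position-by-size grid of recall, which is the shape of result the public haystack evals report.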