Lesson 2 · 11 min

Request patterns: sync, async, streaming, and batching

Choosing the wrong request pattern wastes money or kills UX. Sync, async, streaming, and batch each suit a different latency-cost-throughput trade-off.

The four patterns

Not every LLM call needs an answer in 200ms. Choosing the right request pattern is the single easiest way to cut costs and improve user experience simultaneously.

| Pattern | Latency | Cost | Best for |
|---|---|---|---|
| Synchronous | Seconds (blocks until complete) | Standard | Chat, Q&A, interactive features |
| Streaming | Same wall-clock, lowest perceived | Standard | Chat, long responses |
| Async (background) | Minutes | Standard | Reports, analysis, email drafts |
| Batch | Hours | 50% discount (Batch API) | Nightly jobs, training data, bulk eval |
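The sync-vs-streaming row is easiest to see with a toy example. This sketch uses a hypothetical `mock_llm` generator in place of a real API call: both paths take the same wall-clock time, but streaming delivers the first token almost immediately.

```python
import time

def mock_llm(prompt, n_tokens=5, per_token=0.01):
    """Hypothetical stand-in for an LLM API: emits tokens one at a time."""
    for i in range(n_tokens):
        time.sleep(per_token)  # simulate per-token generation delay
        yield f"tok{i} "

def sync_call(prompt):
    # Synchronous: block until the full response is assembled.
    return "".join(mock_llm(prompt))

def stream_call(prompt):
    # Streaming: pass tokens through as they arrive. Total time is
    # unchanged, but the user sees output after ~one token's delay.
    yield from mock_llm(prompt)

start = time.monotonic()
first_token_at = None
chunks = []
for chunk in stream_call("hello"):
    if first_token_at is None:
        first_token_at = time.monotonic() - start
    chunks.append(chunk)
total = time.monotonic() - start
print(f"first token: {first_token_at:.3f}s, full response: {total:.3f}s")
```

With five tokens, the first chunk lands after roughly a fifth of the total time; with a real model generating hundreds of tokens, the perceived-latency gap is far larger.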
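On the batch side, providers typically take a JSONL file with one request per line. A minimal sketch of building such a file, assuming the OpenAI Batch API's request shape (`custom_id`, `method`, `url`, `body`); the model name and prompts are placeholders:

```python
import json

# One request object per line; custom_id lets you match results back
# to inputs when the batch completes hours later.
requests = [
    {
        "custom_id": f"eval-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",  # placeholder model
            "messages": [{"role": "user", "content": q}],
        },
    }
    for i, q in enumerate(["Summarize doc A", "Summarize doc B"])
]

with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```

You would then upload `batch_input.jsonl` and create a batch job against it; the 50% discount in the table above is the payoff for tolerating the multi-hour completion window.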