Lesson 2 · 11 min
Specifying an AI feature — the eval-first PM template
A seven-section PM template for specifying AI features so that they can actually ship. After the one-line description, eval criteria come first, the success metric second, and failure modes and recovery third.
The template
- One-line description. What the user sees. (E.g. "Auto-summarize a support ticket in one sentence.")
- Eval criteria. A 30-case eval set with scoring rules. Write this before the rest of the spec, not after the feature is built.
- Success metric. What does the team commit to? (E.g. "≥90% of summaries pass the rubric on the eval set; <2% production refusal rate.")
- Failure modes + recovery. What happens when the model is wrong? (Options: refuse, escalate to a human, or show the output with a caveat.)
- Privacy posture. What user data flows to which provider, and how long is it retained?
- Cost envelope. Per-call cost cap; monthly budget at projected volume.
- Lifecycle. A quarterly review, plus the signals that trigger a re-eval outside that schedule (provider model update, eval-set regression, user complaint volume).
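
The eval criteria, success metric, and cost envelope are the three sections that reduce to numbers, so they can be checked mechanically. The following is a minimal Python sketch of that check, not a definitive implementation. It reuses the thresholds from the examples above (90% rubric pass rate, under 2% refusal rate); everything else is an assumption made for illustration: the `EvalCase` shape, the keyword-based rubric in `passes_rubric`, the `summarize` callable standing in for the model call, and all function names.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EvalCase:
    """One eval-set entry (hypothetical shape)."""
    ticket: str              # input handed to the model
    must_mention: list[str]  # keywords the summary has to contain


def passes_rubric(summary: str, case: EvalCase) -> bool:
    """Hypothetical scoring rule: a single sentence that mentions every keyword."""
    text = summary.strip()
    # Crude one-sentence check: at most one sentence-ending mark, no line breaks.
    enders = sum(text.count(mark) for mark in ".!?")
    one_sentence = bool(text) and enders <= 1 and "\n" not in text
    lowered = text.lower()
    return one_sentence and all(k.lower() in lowered for k in case.must_mention)


def pass_rate(cases: Sequence[EvalCase], summarize: Callable[[str], str]) -> float:
    """Fraction of eval cases whose summary passes the rubric."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(passes_rubric(summarize(c.ticket), c) for c in cases)
    return passed / len(cases)


def meets_success_metric(
    eval_pass_rate: float,
    refusal_rate: float,
    min_pass_rate: float = 0.90,     # ">=90% of summaries pass the rubric"
    max_refusal_rate: float = 0.02,  # "<2% production refusal rate"
) -> bool:
    """True only if both committed thresholds hold."""
    return eval_pass_rate >= min_pass_rate and refusal_rate < max_refusal_rate


def within_cost_envelope(
    cost_per_call: float,
    calls_per_month: int,
    per_call_cap: float,
    monthly_budget: float,
) -> bool:
    """Check the per-call cap and the monthly budget at projected volume."""
    projected_monthly = cost_per_call * calls_per_month
    return cost_per_call <= per_call_cap and projected_monthly <= monthly_budget
```

In use, `summarize` would wrap the real model call, and `pass_rate` would run over the 30-case eval set on every prompt or provider change. The result feeds `meets_success_metric` together with the refusal rate measured in production. A real rubric would likely need more than keyword matching (for example, human or model-graded scoring), so treat `passes_rubric` as a placeholder for the scoring rules the spec defines.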