Lesson 2 · 11 min

Specifying an AI feature — the eval-first PM template

The 7-section PM template that makes AI features actually shippable. Eval criteria first, success metric second, edge cases and recovery third.

The template

One-line description. What the user sees. (E.g. "Auto-summarize a support ticket in one sentence.")
Eval criteria. A 30-case eval set with scoring rules. This goes first, not last.
Success metric. What does the team commit to? (E.g. "≥90% of summaries pass the rubric on the eval set; <2% production refusal rate.")
Failure modes + recovery. What happens when the model is wrong? (Refuse, escalate to human, show with caveat.)
Privacy posture. What user data flows to which provider? Retention?
Cost envelope. Per-call cost cap; monthly budget at projected volume.
Lifecycle. Quarterly review trigger; signal that prompts a re-eval (provider model update, eval-set regression, user complaint volume).