Where they ship
As of mid-2026, the four production-leaning options are Anthropic Computer Use (Claude with a screen, mouse, and keyboard tool), OpenAI Operator, Browser Use (an open-source library built on Playwright plus a frontier model), and Highway (open-source, desktop-only).
What works on all of them:
- Repetitive workflows on stable surfaces — quarterly export from a known admin panel, scripted form-filling on a government website, regression QA on your own product.
- Pixel-grounding on common SaaS (Google Docs, Salesforce, GitHub) — the model can find the Save button reliably.
- Two-step decompose-then-act — the model writes a plan first, then executes step by step with explicit checkpoints.
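The decompose-then-act pattern in the last bullet can be sketched without tying it to any particular vendor. This is a minimal illustration, not any product's API: `Step`, `act`, `check`, and `run_plan` are hypothetical names, and the "screen" is a plain dictionary standing in for real UI state. The point is only the control flow: each planned step runs, a checkpoint verifies the expected state, and execution halts at the first checkpoint that still fails after retries.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str           # human-readable plan entry
    act: Callable[[], None]    # hypothetical: performs the UI action
    check: Callable[[], bool]  # hypothetical: checkpoint on the resulting state


def run_plan(steps: List[Step], max_retries: int = 1) -> List[str]:
    """Run steps in order; stop at the first checkpoint that keeps failing."""
    log: List[str] = []
    for i, step in enumerate(steps, start=1):
        for _ in range(max_retries + 1):
            step.act()
            if step.check():
                log.append(f"ok {i}: {step.description}")
                break
        else:
            # No attempt passed the checkpoint: record it and abort the plan.
            log.append(f"failed {i}: {step.description}")
            break
    return log


# Toy stand-in for screen state (an assumption for illustration only).
screen = {"page": "login"}
plan = [
    Step("open export page",
         lambda: screen.update(page="export"),
         lambda: screen["page"] == "export"),
    Step("click Export",
         lambda: screen.update(page="done"),
         lambda: screen["page"] == "done"),
]
result = run_plan(plan)
```

Stopping at the first failed checkpoint keeps a misclick from cascading into later steps; in a real deployment the checkpoint would inspect a screenshot or page state rather than a dictionary.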
What doesn't work yet:
- Open-ended adversarial workflows (for example, "book me arbitrary travel across any travel website").
- Bespoke internal tools at non-standard zoom levels — pixel grounding degrades fast.
- Tasks requiring hours of sustained attention: accuracy drifts and cost accumulates over the run.
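One way to contain the drift-and-cost problem on long runs is a hard budget that ends the session once a step or spend limit is crossed. The sketch below is an assumption-laden illustration: `Budget`, `charge`, and the dollar figures are hypothetical and are not taken from any vendor's pricing or SDK.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_steps: int          # hypothetical cap on agent actions per run
    max_cost_usd: float     # hypothetical cap on spend per run
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, step_cost_usd: float) -> bool:
        """Record one step; return False once either limit is exceeded."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        return (self.steps <= self.max_steps
                and self.cost_usd <= self.max_cost_usd)


# Illustrative numbers only: stop after 3 steps or 1.00 USD, whichever comes first.
budget = Budget(max_steps=3, max_cost_usd=1.00)
completed = 0
while budget.charge(0.10):
    completed += 1  # a real loop would perform one agent action here
```

The loop exits on the fourth charge because the step cap is hit first; handing off to a human at that point is usually safer than letting a drifting run continue.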
What to actually do
If your product has a workflow that currently runs as a Zapier/iPaaS integration, computer use can sometimes replace it more cheaply. If the workflow is a bespoke flow on a stable internal admin panel, computer use is a strong fit. Don't expect a general-purpose office assistant; that is still aspirational.