Computer-use agents survey — what they actually do well in mid-2026

Anthropic, OpenAI, and several open-source projects all ship 'agent operates a UI' models now. The capability map is sharper than the marketing.

Where they ship

As of mid-2026, the four production-leaning options are Anthropic Computer Use (Claude with a screen, mouse, and keyboard tool), OpenAI Operator, Browser Use (an open-source library built on Playwright plus a frontier model), and Highway (open-source, desktop-only).

What works on all of them:

Repetitive workflows on stable surfaces — quarterly export from a known admin panel, scripted form-filling on a government website, regression QA on your own product.
Pixel-grounding on common SaaS (Google Docs, Salesforce, GitHub) — the model can find the Save button reliably.
Two-step decompose-then-act — the model writes a plan first, then executes step by step with explicit checkpoints.
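
The decompose-then-act pattern can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's API: `Step`, `RunResult`, `run_plan`, and the `act` and `verify` callables are hypothetical names. In a real setup, `act` would presumably wrap a model call plus the resulting UI actions, and `verify` would inspect a screenshot or the DOM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    description: str  # what the agent should do, e.g. "click Export CSV"
    checkpoint: str   # observable condition that must hold afterwards


@dataclass
class RunResult:
    completed: List[str]      # descriptions of steps whose checkpoint passed
    failed_at: Optional[str]  # description of the first failed step, if any


def run_plan(
    steps: List[Step],
    act: Callable[[str], None],     # hypothetical: performs one step on the UI
    verify: Callable[[str], bool],  # hypothetical: checks one checkpoint
    max_retries: int = 1,
) -> RunResult:
    """Execute a pre-written plan step by step, verifying after each step."""
    completed: List[str] = []
    for step in steps:
        for _attempt in range(max_retries + 1):
            act(step.description)
            if verify(step.checkpoint):
                completed.append(step.description)
                break
        else:
            # Every attempt failed its checkpoint: stop instead of pressing on.
            return RunResult(completed, failed_at=step.description)
    return RunResult(completed, failed_at=None)
```

The plan (the list of `Step` objects) is written before any action is taken, and each step must pass its checkpoint before the next one starts. Halting at the first failed checkpoint keeps a wrong click from compounding across the rest of the run.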

What doesn't work yet:

Open-ended adversarial workflows (for example, "book me arbitrary travel across any travel website").
Bespoke internal tools at non-standard zoom levels — pixel grounding degrades fast.
Tasks requiring sustained hours of attention — accuracy drifts; cost adds up.
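
One common way to contain the drift and cost of long runs is a hard budget that stops the agent once a step or spend limit is crossed. The sketch below is an assumption-laden illustration: `RunBudget`, `BudgetExceeded`, and the per-step cost passed to `charge` are hypothetical, and real per-step costs would have to come from your provider's usage reporting.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run crosses its step or spend limit."""


@dataclass
class RunBudget:
    max_steps: int        # hypothetical cap on agent actions per run
    max_cost_usd: float   # hypothetical cap on spend per run
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, step_cost_usd: float) -> None:
        """Record one step and its cost; raise if either limit is exceeded."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd:.2f} exceeded")
```

Calling `budget.charge(cost)` once per agent action turns an open-ended session into a bounded one, so a run that has drifted fails loudly instead of quietly accumulating cost.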

What to actually do

If your product has a workflow that currently runs as a Zapier/iPaaS integration, a computer-use agent can sometimes replace it more cheaply. If the workflow is a bespoke flow on a stable internal admin panel, it's a strong fit. Don't expect a general-purpose office assistant; that's still aspirational.
