Show HN: Drive any macOS app in the background without stealing the cursor


TLDR

  • cua-driver lets AI agents click, type, and verify in any native macOS app without taking focus, cursor, or the active Space.

Key Takeaways

  • Works on surfaces that standard accessibility (AX) APIs can’t reach: Chromium web content and canvas-based tools such as Blender, Figma, DAWs, and game engines.
  • Every session is recorded as a replayable trajectory, usable for RL training via cua-bench on OSWorld, ScreenSpot, and Windows Arena benchmarks.
  • Unified Python SDK (cua) targets Linux containers, Linux VMs, macOS, Windows, and Android with the same Sandbox.ephemeral() API, whether the sandbox runs locally under QEMU or in the cloud.
  • cuabot gives coding agents (Claude Code, OpenClaw) isolated sandboxes with H.265 display, shared clipboard, and audio; individual windows appear natively on the host desktop.
  • Installs via a single curl script; an MCP server ships with the package for direct Claude Code and Cursor integration.

Hacker News Comment Review

  • An ex-Apple engineer validated the macOS background-automation approach, noting parallel UI test execution as the headline win, but flagged telemetry-on-by-default as a friction point for privacy-conscious adopters.
  • Compliance and audit readiness surfaced as an open question: trajectory logs capture what the agent did, but no mechanism yet explains the decision behind each action to a compliance team.
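
To make the auditability gap concrete, here is a minimal sketch of a trajectory log that carries an optional rationale per action. The schema, the TrajectoryStep class, and both helper functions are hypothetical illustrations, not cua-driver's actual recording format, which the post does not describe.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class TrajectoryStep:
    """One recorded agent action (hypothetical schema, not cua-driver's format)."""
    index: int
    action: str                      # e.g. "click" or "type"
    target: str                      # human-readable description of the UI element
    rationale: Optional[str] = None  # the "why"; None when the agent gave no reason


def to_jsonl(steps: List[TrajectoryStep]) -> str:
    """Serialize steps as JSON Lines, one action per line."""
    return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in steps)


def missing_rationale(steps: List[TrajectoryStep]) -> List[int]:
    """Return the indices of steps that have no recorded explanation."""
    return [s.index for s in steps if not s.rationale]


steps = [
    TrajectoryStep(0, "click", "Export button", "User asked for a PDF export"),
    TrajectoryStep(1, "type", "Filename field"),
]
print(to_jsonl(steps))
print(missing_rationale(steps))
```

A log like this records what happened at every step, while the second helper surfaces exactly the steps a compliance reviewer could not explain, which is the gap the commenter describes.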

Notable Comments

  • @LatencyKills: Built similar tooling at Apple; endorses the implementation but calls out telemetry being opt-out rather than opt-in as the one concrete criticism.
  • @davey2wavey: Raises an agent auditability gap: logs exist, but “how do you explain the ‘why’ behind each decision to a compliance team?”
