OpenAI Codex Team: From Coding Autocomplete to Asynchronous Autonomous Agents

Watch on YouTube ↗ Summary based on the YouTube transcript and episode description.

OpenAI’s Hanson Wang and Alexander Embiricos explain how Codex was RL-tuned beyond competitive coding so that it can work autonomously for up to 30 minutes and return mergeable PRs.

  • Codex is o3 with additional RL fine-tuning for professional software-engineering taste (style, PR descriptions, test hygiene), not just correctness.
  • Codex runs in its own cloud container for up to 30 minutes and returns a full PR; users delegate rather than pair.
  • Power users inside OpenAI generate 10+ PRs per day by adopting an abundance mindset: kick off many tasks in parallel, then keep the winners.
  • Training environments and production environments are identical containers, eliminating the ‘works on my machine’ problem.
  • Engineers spend roughly 35% of their time actually writing code; Codex targets that slice, leaving design, planning, and review to humans for now.
  • OpenAI intentionally named the internal project WHAM so agents could grep the codebase unambiguously. The concrete tip for agent-addressable codebases: give projects distinctive, searchable names.
  • Wang predicts the number of professional software developers goes up, not down, as bespoke per-team software becomes practical to build.
  • Embiricos’ half-serious future UI for agent collaboration: a TikTok-style vertical feed where founders swipe to approve or reject agent-generated PRs and ideas.
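
The WHAM naming tip can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the file paths, the file contents, and the `count_hits` helper are invented and do not come from the episode or from OpenAI's codebase. The helper approximates a whole-word, case-sensitive search (similar in spirit to `grep -w`) and shows why a distinctive name narrows a search while a generic one does not.

```python
import re


def count_hits(files: dict[str, str], name: str) -> dict[str, int]:
    """Count whole-word, case-sensitive matches of `name` in each file.

    Files with zero matches are omitted, so the result is roughly the
    set of files an agent would have to open after searching for `name`.
    """
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = {path: len(pattern.findall(text)) for path, text in files.items()}
    return {path: n for path, n in hits.items() if n}


# Hypothetical mini-codebase; paths and contents are invented.
FILES = {
    "services/wham/runner.py": "# WHAM task runner\ndef run_wham_task(agent): ...\n",
    "services/billing/agent.py": "# billing agent for the sales agent team\n",
    "docs/README.md": "WHAM runs each agent task in a container.\n",
}

# A distinctive project name only matches files about that project.
distinctive = count_hits(FILES, "WHAM")

# A generic word also matches unrelated code such as the billing service.
generic = count_hits(FILES, "agent")

print(sorted(distinctive))
print(sorted(generic))
```

In this toy example the search for `WHAM` skips the unrelated billing file, while the search for `agent` matches every file, which is the ambiguity the naming choice is meant to avoid.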

2025-06-10