OpenAI Codex Team: From Coding Autocomplete to Asynchronous Autonomous Agents
Watch on YouTube ↗ Summary based on the YouTube transcript and episode description.
OpenAI’s Hanson Wang and Alexander Embiricos explain how Codex was RL-tuned beyond competitive coding so it can work autonomously for up to 30 minutes and return mergeable PRs.
- Codex is o3 with additional RL fine-tuning for professional software engineering taste: style, PR descriptions, test hygiene — not just correctness.
- Codex runs in its own cloud container for up to 30 minutes and returns a full PR; users delegate rather than pair.
- Internal OpenAI power users run 10+ PRs per day by adopting an abundance mindset — kick off many tasks in parallel, pick winners.
- Training environments and production environments are identical containers, eliminating the ‘works on my machine’ problem.
- Engineers spend roughly 35% of their time actually writing code; Codex targets that slice, leaving design, planning, and review to humans for now.
- OpenAI intentionally named the internal project WHAM so agents could grep the codebase unambiguously — a concrete tip for agent-addressable codebases.
- Wang predicts the number of professional software developers goes up, not down, as bespoke per-team software becomes practical to build.
- Embiricos’ half-serious future UI for agent collaboration: a TikTok-style vertical feed where founders swipe to approve or reject agent-generated PRs and ideas.
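The WHAM naming tip above can be checked mechanically. The following is a minimal sketch, not anything described in the episode: `count_name_hits` is a hypothetical helper that counts whole-word, case-insensitive matches of a candidate project name in every readable text file under a directory. A distinctive name should only appear in files that belong to the project, while a generic word such as "agent" tends to appear everywhere.

```python
import re
from pathlib import Path


def count_name_hits(root: str, name: str) -> dict[str, int]:
    """Return {relative_path: hit_count} for files under `root` that mention `name`.

    Hypothetical helper for illustration: matches are whole-word and
    case-insensitive, roughly what `grep -riw NAME ROOT` would report.
    """
    pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
    hits: dict[str, int] = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        count = len(pattern.findall(text))
        if count:
            hits[path.relative_to(root).as_posix()] = count
    return hits


# Example usage (paths are placeholders):
# distinctive = count_name_hits("path/to/repo", "wham")
# generic = count_name_hits("path/to/repo", "agent")
# A name is easy for an agent to search for when `distinctive` lists only
# files that really belong to the project.
```

Running the same check on a few candidate names before starting a project is a cheap way to pick one that both humans and agents can search for without noise.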
2025-06-10