OpenAI Codex Team: From Coding Autocomplete to Asynchronous Autonomous Agents

Uploaded: 2025-06-10T12:00:00.000000Z

Summary based on the YouTube transcript and episode description.

OpenAI’s Hanson Wang and Alexander Embiricos explain how Codex was tuned with reinforcement learning (RL) beyond competitive coding, so that it can work autonomously for up to 30 minutes and return a mergeable PR.

Codex is o3 with additional RL fine-tuning for professional software engineering taste: style, PR descriptions, and test hygiene, not just correctness.
Codex runs in its own cloud container for up to 30 minutes and returns a full PR; users delegate tasks to it rather than pair-program with it.
Internal OpenAI power users run 10+ PRs per day by adopting an abundance mindset: kick off many tasks in parallel, then pick the winners.
Training environments and production environments are identical containers, eliminating the ‘works on my machine’ problem.
Engineers spend roughly 35% of their time actually writing code; Codex targets that slice, leaving design, planning, and review to humans for now.
OpenAI intentionally named the internal project WHAM so agents could grep the codebase unambiguously — a concrete tip for agent-addressable codebases.
Wang predicts the number of professional software developers goes up, not down, as bespoke per-team software becomes practical to build.
Embiricos’ half-serious future UI for agent collaboration: a TikTok-style vertical feed where founders swipe to approve or reject agent-generated PRs and ideas.
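
The "abundance mindset" point above (start many tasks in parallel, then keep the ones that work) can be sketched as a fan-out-and-filter loop. This is only an illustration under stated assumptions, not Codex's actual API: `Attempt`, `fan_out`, `pick_winners`, and `fake_agent` are hypothetical names, and the stand-in agent simply treats any task containing the word "flaky" as failing its tests.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Attempt:
    """Result of one delegated task (hypothetical shape, not a real Codex type)."""
    task: str
    diff: str
    tests_passed: bool


def fan_out(
    tasks: List[str],
    run_agent: Callable[[str], Attempt],
    max_parallel: int = 4,
) -> List[Attempt]:
    """Run every task concurrently and return results in submission order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves the order of the input tasks.
        return list(pool.map(run_agent, tasks))


def pick_winners(attempts: List[Attempt]) -> List[Attempt]:
    """Keep only attempts whose tests passed; a human would still review these."""
    return [a for a in attempts if a.tests_passed]


def fake_agent(task: str) -> Attempt:
    """Stand-in for a real agent call (assumption): fails tasks mentioning 'flaky'."""
    return Attempt(
        task=task,
        diff=f"# patch for: {task}",
        tests_passed="flaky" not in task,
    )


if __name__ == "__main__":
    attempts = fan_out(["fix typo", "flaky refactor", "add test"], fake_agent)
    for winner in pick_winners(attempts):
        print(winner.task)
```

In practice `run_agent` would wrap whatever mechanism submits a task and waits for its PR; the filter step here uses test status only, whereas the episode's framing leaves the final choice to human review.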
