OpenAI Just Released ChatGPT Agent, Its Most Powerful Agent Yet
Watch on YouTube ↗ Summary based on the YouTube transcript and episode description.
OpenAI’s Isa Fulford, Casey Chu, and Edward Sun explain how merging Deep Research and Operator into a single RL-trained agent unlocks hour-long autonomous computer tasks.
- Agent runs tasks for up to one hour on a virtual computer with a text browser, GUI browser, terminal, and API integrations, all sharing a single file system state.
- Trained via reinforcement learning across hundreds of thousands of VMs; the model discovers when to use each tool without being explicitly programmed.
- Agent benchmark scores surpass human baseline on data science tasks; spreadsheet and slide generation are highlighted as new breakout capabilities.
- A demo task had the agent autonomously build an OpenAI valuation model with financial projections, a spreadsheet, and a slide deck in 28 minutes.
- Bio-risk was the top safety concern; weeks of red teaming were run, and a real-time monitor, acting like an antivirus layer, flags suspicious trajectories.
- Core team was 3–4 researchers on the Deep Research side and 6–8 on the computer-use agent side; the two groups were combined for only a few months before shipping.
- RL is highly data-efficient: the fine-tuning dataset is orders of magnitude smaller than pre-training data, enabling fast capability iteration.
- Roadmap priorities: proactive agents that act without user prompting, personalization and memory, and new interaction paradigms beyond chat.
2025-07-22