How OpenAI Built its Groundbreaking Deep Research Product ft. Isa Fulford


Summary based on the YouTube transcript and episode description.

Isa Fulford explains how OpenAI built Deep Research: a version of o3 fine-tuned with reinforcement learning to use browsing and code-execution tools.

  • Deep Research is a fine-tuned version of o3 trained specifically for web browsing and data analysis via reinforcement learning.
  • The team started with a hacked-together demo that used plain prompting, with no model training, just to get internal buy-in before any RL work began.
  • Training required building two core tools: a browser (search, click, scroll) and a Python code executor for data analysis and graphing.
  • o3’s search capability is a direct downstream benefit of the same tools and browsing datasets developed for Deep Research.
  • Deep Research asks clarifying questions before starting because it runs for 5–30 minutes and OpenAI wants users to front-load specificity.
  • Roadmap priorities: reduce hallucinations, integrate private/internal company knowledge, and shift from synthesis to taking real-world actions.
  • Initial RL training focused on math, science, and coding; the team hypothesized that training directly on everyday browsing tasks would carry over better to what users actually need.
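
To make the two tool surfaces named in the list above more concrete, here is a minimal toy sketch of a browser exposing search, click, and scroll, plus a Python code executor. The summary does not describe OpenAI's actual implementation, so everything here is an assumption for illustration only: the names `ToyBrowser` and `run_python`, the in-memory pages, and the example URLs are all hypothetical.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyBrowser:
    """Hypothetical in-memory stand-in for a browsing tool (no network access)."""
    pages: dict                  # url -> list of text lines on that page
    url: Optional[str] = None    # currently open page, if any
    offset: int = 0              # index of the first visible line
    window: int = 2              # number of lines visible per "screen"

    def search(self, query: str) -> list:
        # Return URLs whose text mentions the query (case-insensitive).
        q = query.lower()
        return [u for u, lines in self.pages.items()
                if any(q in line.lower() for line in lines)]

    def click(self, url: str) -> str:
        # Open a page and return its first screen of text.
        self.url, self.offset = url, 0
        return self._view()

    def scroll(self) -> str:
        # Advance one screen, stopping at the end of the page.
        last = max(len(self.pages[self.url]) - self.window, 0)
        self.offset = min(self.offset + self.window, last)
        return self._view()

    def _view(self) -> str:
        lines = self.pages[self.url]
        return "\n".join(lines[self.offset:self.offset + self.window])


def run_python(code: str) -> str:
    """Toy code executor: runs the code and returns whatever it printed.
    Not sandboxed; a real system would isolate execution."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()


# Usage: search, open the top hit, scroll once, then analyze the numbers.
browser = ToyBrowser(pages={
    "https://example.com/a": ["Tea exports 2023", "Region A: 10", "Region B: 30", "End"],
    "https://example.com/b": ["Coffee prices", "No data"],
})
hits = browser.search("tea")
first_screen = browser.click(hits[0])
second_screen = browser.scroll()
result = run_python("print(sum([10, 30]))")
```

In a setup like this, a model would emit one tool call at a time and receive the returned text as its next observation. How Deep Research actually wires its tools together is not covered in the summary.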

2025-05-08