The inside story of how ChatGPT was built – OpenAI cofounder John Schulman
Summary based on the YouTube transcript and episode description.
OpenAI cofounder John Schulman explains how ChatGPT emerged from instruction-following research, and why the chat framing made collecting human feedback for RLHF dramatically easier.
- ChatGPT was built on GPT-3.5, which finished training in early 2022 and proved surprisingly strong at code.
- Google’s LaMDA and Meena preceded ChatGPT, but they were focused on persona and fun, not on being functional assistants.
- GPT-4 finished training in August 2022; early instruction-tuned versions of GPT-4 were impressive but hallucinated and sometimes gave unhinged outputs.
- The breakthrough was mixing the instruct and chat datasets, which yielded reliable, self-aware behavior.
- The chat framing made human data labeling far easier: labelers intuitively understood what a helpful robot should do, whereas the instruct task was vaguely specified.
- Iterative supervised fine-tuning on model-edited outputs (not raw human data) was essential, since purely human-written data is hard for models to fit; the loop is sketched after this list.
- Someone with API access to GPT-3.5 fine-tuning could have built something close to ChatGPT; the non-trivial differentiator was the iterative, RL-style training (see the second sketch below).
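
As a rough illustration of the "fine-tune on model-edited outputs" loop described above, here is a minimal Python sketch. The helpers `generate`, `human_edit`, and `supervised_finetune` are hypothetical stubs standing in for sampling, labeler edits, and an SFT step; this is the shape of the idea, not OpenAI's actual pipeline.

```python
def generate(model, prompt):
    # Stub: in practice, sample a completion from the current model.
    return f"{model} draft for: {prompt}"

def human_edit(prompt, draft):
    # Stub: in practice, a labeler lightly corrects the model's draft
    # instead of writing an answer from scratch.
    return draft + " (edited)"

def supervised_finetune(model, pairs):
    # Stub: in practice, run standard SFT on (prompt, edited target) pairs.
    return f"{model}+sft"

def iterative_sft(model, prompts, rounds=3):
    """Each round: sample from the current model, have labelers edit the
    samples, then fine-tune on the edited targets. Editing model outputs
    keeps the training targets close to the model's own distribution,
    which is why this fits better than raw human-written data."""
    for _ in range(rounds):
        drafts = [generate(model, p) for p in prompts]
        targets = [human_edit(p, d) for p, d in zip(prompts, drafts)]
        model = supervised_finetune(model, list(zip(prompts, targets)))
    return model
```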
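
And for the last bullet, a hedged sketch of what the fine-tuning-API path looks like with today's `openai` Python SDK, which postdates the period discussed in the episode. The file name, example data, and model name are placeholders, not anything from the interview.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Chat-formatted training data: each JSONL line is one conversation.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain RLHF in one sentence."},
        {"role": "assistant", "content": "RLHF fine-tunes a model against a "
         "reward model trained on human preference comparisons."},
    ]},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Upload the data, then launch a supervised fine-tuning job.
uploaded = client.files.create(file=open("train.jsonl", "rb"),
                               purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=uploaded.id,
    model="gpt-3.5-turbo",  # placeholder; any fine-tunable chat model
)
print(job.id)
```

Note this only covers the supervised step; the iterative, RL-style training the bullet calls the real differentiator is not something the public fine-tuning API exposes.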
2024-05-20 · Watch on YouTube