What are we scaling?
Dwarkesh Patel argues RL scaling is incoherent with short AGI timelines, and continual learning — not RL — is the real missing capability.
- That labs must bake skills into models via RL implies models won’t generalize on the job — contradicting imminent-AGI timelines.
- Beren Millidge: benchmark gains reflect billions spent on expert-labeled data, not just compute or algorithmic progress.
- Toby Ord estimates that a ~1,000,000x scale-up in RL compute would yield only a GPT-generation-sized capability boost.
- Slow enterprise AI adoption is not diffusion lag — if models were truly AGI-level, onboarding would be faster than hiring humans.
- Knowledge workers earn tens of trillions of dollars per year globally; that labs earn orders of magnitude less reveals a real capability gap.
- Goalpost shifting on AGI definitions is partially justified: Gemini 3 in 2020 would have seemed sufficient for half of knowledge work.
- Continual learning — agents gaining domain experience and distilling it back to a shared model — is the actual missing driver, not RL from verifiable reward.
- Human-level on-the-job learning may take another 5–10 years; no single lab breakthrough will trigger a runaway intelligence explosion given fierce multi-lab competition.
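The continual-learning loop sketched above — agents gaining domain experience in parallel, then distilling it back into a shared model — can be illustrated with a toy sketch. Everything here (`SharedModel`, `work`, `distill`, the skill-score representation) is a hypothetical illustration of the idea, not any lab's actual system.

```python
# Toy sketch of a continual-learning loop: agents fork from a shared
# model, accumulate on-the-job experience, and distill it back.
from collections import defaultdict

class SharedModel:
    """Shared base model, crudely represented as per-domain skill scores."""
    def __init__(self):
        self.skill = defaultdict(float)

    def fork(self):
        # Each deployed agent starts from a copy of the shared state.
        agent = SharedModel()
        agent.skill = defaultdict(float, self.skill)
        return agent

def work(agent, domain, episodes):
    """Agent gains experience in its domain (stand-in for learning from feedback)."""
    for _ in range(episodes):
        agent.skill[domain] += 1.0
    return agent

def distill(shared, agents):
    """Merge what each agent learned back into the shared model."""
    for agent in agents:
        for domain, score in agent.skill.items():
            shared.skill[domain] = max(shared.skill[domain], score)
    return shared

shared = SharedModel()
agents = [work(shared.fork(), d, 5) for d in ("law", "accounting")]
shared = distill(shared, agents)
print(sorted(shared.skill.items()))  # both domains now live in the shared model
```

The point of the sketch is the topology, not the learning rule: experience is gathered per-agent but consolidated centrally, which is the mechanism the episode argues is missing from RL-on-verifiable-rewards.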
2025-12-23 · Watch on YouTube