How Scaling Laws Will Determine AI's Future | YC Decoded


Watch on YouTube ↗ · Summary based on the YouTube transcript and episode description.

Garry Tan breaks down the scaling laws debate: pre-training may be plateauing, but test-time compute via o1/o3 opens a new scaling frontier.

  • AI model performance has been doubling roughly every six months, versus every 18 months for Moore’s Law.
  • OpenAI’s 2020 scaling laws paper (Kaplan et al.) showed smooth power-law improvement in loss when parameters, data, and compute are scaled together.
  • Google DeepMind’s Chinchilla paper (2022) showed GPT-3-class models were significantly undertrained: a model less than half GPT-3’s size, trained on roughly 4x the data, outperformed much larger models.
  • Frontier labs are reportedly still scaling GPU counts at the same pace, yet capability gains from pre-training appear to be stalling, amid rumors of disappointing training runs at major labs.
  • Data scarcity is an emerging bottleneck; some researchers argue high-quality training data may run out sooner than expected.
  • OpenAI’s o3 smashed benchmarks in software engineering, math, and PhD-level science — a qualitative leap, not an incremental gain.
  • Test-time compute (scaling chain-of-thought reasoning at inference) is the new scaling paradigm: more thinking time yields better performance, and it is not constrained by training-data limits.
  • Scaling laws extend beyond LLMs to image diffusion, protein folding, chemical models, and robotics world models — still early innings.
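
The scaling relationships summarized above can be sketched numerically. This is an illustrative approximation, not material from the episode: the loss-vs-parameters power law uses the approximate fitted constants reported in Kaplan et al. (2020), and the Chinchilla result is reduced to its widely cited rule of thumb of roughly 20 training tokens per parameter.

```python
def kaplan_loss(n_params: float) -> float:
    """Approximate pre-training loss as a power law in parameter count:
    L(N) ~ (N_c / N)^alpha_N, with N_c ≈ 8.8e13 and alpha_N ≈ 0.076
    (approximate fits from Kaplan et al., 2020)."""
    N_c, alpha_N = 8.8e13, 0.076
    return (N_c / n_params) ** alpha_N

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Chinchilla heuristic: compute-optimal training uses roughly
    20 tokens per parameter."""
    return 20.0 * n_params

# A power law means each doubling of model size multiplies loss by the
# same constant factor, 2 ** -alpha_N:
ratio = kaplan_loss(2e9) / kaplan_loss(1e9)  # ≈ 0.95 per doubling

# Consistent with the 70B Chinchilla model's ~1.4T training tokens:
tokens = chinchilla_optimal_tokens(70e9)  # 1.4e12
```

The shrinking per-doubling gain (~5% here) is one way to see why pure parameter scaling eventually yields diminishing returns, which is the plateau the episode discusses.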

2025-01-23