Why Scale Will Not Solve AGI | Vishal Misra - The a16z Show


Summary based on the YouTube transcript and episode description.

Columbia professor Vishal Misra presents a mathematical proof that transformers perform exact Bayesian updating, then argues that scale alone cannot reach AGI because LLMs lack plasticity and causal reasoning.

  • Transformers match the exact Bayesian posterior to within 10^-3 bits on tasks too large to memorize, a result proven mathematically rather than only observed empirically.
  • Transformer > Mamba > LSTM > MLP in Bayesian updating capability; MLPs fail completely.
  • Misra claims he built the first known RAG implementation in October 2020 for ESPN’s cricket stats database using GPT-3 with a custom DSL.
  • LLM weights are frozen after training, and every new conversation starts from scratch: unlike human synapses, which stay plastic for life, LLMs accumulate no plasticity across interactions.
  • AGI requires two unsolved things: continual learning without catastrophic forgetting, and moving from correlation-based pattern matching to causal simulation.
  • LLMs operate in Shannon-entropy space (correlation); AGI requires Kolmogorov-complexity space (shortest causal program) — Einstein’s relativity is a Kolmogorov achievement, unreachable by correlation alone.
  • Donald Knuth’s viral LLM math result validates the thesis: LLMs found component solutions via exhaustive Shannon search, but Knuth’s own brain assembled the causal proof.
  • Misra rejects consciousness claims about LLMs: their objective is next-token accuracy, not survival or reproduction, so apparent self-preservation behavior is a training-data artifact, not architecture.
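To make the first bullet concrete, here is a minimal sketch of what "exact Bayesian updating" means, using a Beta-Bernoulli model as the reference posterior. This is an illustrative toy, not Misra's construction; the model, function names, and prior are assumptions for the sketch. The idea is that a sequence model's next-token probabilities can be compared, token by token and in bits, against the exact posterior predictive of a known generative process.

```python
from math import log2

# Illustrative sketch (not Misra's construction): exact Bayesian updating
# on a Beta-Bernoulli model, the kind of closed-form posterior a trained
# sequence model's predictions could be compared against in bits.

def beta_bernoulli_posterior(observations, alpha=1.0, beta=1.0):
    """Exact posterior (alpha, beta) after observing a list of 0/1 tokens."""
    heads = sum(observations)
    tails = len(observations) - heads
    return alpha + heads, beta + tails

def predictive_prob(alpha, beta, next_token):
    """Exact posterior-predictive probability of the next 0/1 token."""
    p_one = alpha / (alpha + beta)
    return p_one if next_token == 1 else 1.0 - p_one

obs = [1, 1, 0, 1]
a, b = beta_bernoulli_posterior(obs)   # posterior is Beta(4, 2) under a uniform prior
p = predictive_prob(a, b, 1)           # P(next token = 1) = 4/6
surprisal_bits = -log2(p)              # code length of that token in bits
```

A model that "matches the posterior to 10^-3 bits" would assign next-token probabilities whose surprisal differs from this exact value by less than 10^-3 bits.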

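The Shannon-versus-Kolmogorov contrast in the bullets above can be illustrated with a toy example (mine, not from the episode): a long periodic string looks maximally uncertain to a memoryless token model, yet a short program reproduces it exactly, so its description length in the Kolmogorov sense is roughly constant.

```python
from math import log2
from collections import Counter

# Toy illustration (not from the episode): a correlation-style token model
# sees per-symbol entropy, while the "shortest program" view sees a constant.

s = "01" * 500                        # 1000-character periodic string

# Shannon view: unigram entropy treats the string as i.i.d. draws.
counts = Counter(s)
n = len(s)
unigram_entropy = -sum(c / n * log2(c / n) for c in counts.values())

# Kolmogorov view: the entire string is generated by a tiny program,
# so its description length does not grow with the string.
program = '"01" * 500'
```

Under the unigram model the string costs about 1 bit per character (1000 bits total), while the generating program is a handful of characters; capturing that gap is what moving from correlation to causal simulation means in this framing.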
2026-03-17 · Watch on YouTube