Why Scale Will Not Solve AGI | Vishal Misra - The a16z Show
Columbia professor Vishal Misra proves that transformers perform exact Bayesian updating, then argues that scale alone cannot reach AGI because LLMs lack plasticity and causal reasoning.
- Transformers match the exact Bayesian posterior to within 10^-3 bits on tasks too large to memorize — proven mathematically, not just empirically.
- Transformer > Mamba > LSTM > MLP in Bayesian updating capability; MLPs fail completely.
- Misra claims he built the first known RAG implementation in October 2020 for ESPN’s cricket stats database using GPT-3 with a custom DSL.
- LLM weights are frozen after training; every new conversation starts from zero — no accumulated plasticity, unlike human synapses, which remain plastic for life.
- AGI requires two unsolved things: continual learning without catastrophic forgetting, and moving from correlation-based pattern matching to causal simulation.
- LLMs operate in Shannon-entropy space (correlation); AGI requires Kolmogorov-complexity space (shortest causal program) — Einstein’s relativity is a Kolmogorov achievement, unreachable by correlation alone.
- Donald Knuth’s viral LLM math result validates the thesis: LLMs found component solutions via exhaustive Shannon search, but Knuth’s own brain assembled the causal proof.
- Misra rejects consciousness claims about LLMs: their objective is next-token accuracy, not survival or reproduction, so apparent self-preservation behavior is an artifact of the training data, not of the architecture.
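To make the "exact Bayesian updating" claim in the first bullet concrete, here is the textbook Beta-Bernoulli case: after each observed token, the posterior predictive probability of the next token is updated in closed form. This is an illustrative sketch of what a sequence model would have to match, not Misra's actual proof setup; the function name and uniform Beta(1, 1) prior are choices made here for illustration.

```python
def beta_bernoulli_predictive(tokens, alpha=1.0, beta=1.0):
    """Sequentially update a Beta(alpha, beta) prior on a coin's bias
    from a stream of 0/1 tokens, returning the posterior predictive
    P(next token = 1) before each observation and after the last one."""
    preds = []
    heads = tails = 0
    for t in tokens:
        # Conjugate update: posterior is Beta(alpha + heads, beta + tails),
        # so the predictive probability of a 1 is its mean.
        preds.append((alpha + heads) / (alpha + beta + heads + tails))
        if t == 1:
            heads += 1
        else:
            tails += 1
    preds.append((alpha + heads) / (alpha + beta + heads + tails))
    return preds

# With a uniform prior these are the Laplace rule-of-succession estimates.
probs = beta_bernoulli_predictive([1, 1, 0, 1])
```

A transformer trained on such sequences "matching the posterior to within 10^-3 bits" means its predicted next-token distribution stays within that KL divergence of these closed-form values.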
2026-03-17 · Watch on YouTube