The ML Technique Every Founder Should Know
YC’s Francois Chaubard explains why diffusion has displaced autoregressive models in nearly every AI domain except LLMs and game-playing.
- Diffusion has eaten virtually all of AI except autoregressive LLMs and game-playing (AlphaGo/MCTS).
- Flow matching reduces the training loop to ~10-15 lines of code: predict a single velocity field that carries noise to data, architecture-agnostic (see the training-step sketch after this list).
- Linear noise schedules are unstable; the cosine-like beta/alpha-bar schedule is the hardest part to get right and unlocks everything else (a schedule sketch follows this list).
- At inference you cannot exceed the number of diffusion steps the model was trained on; doubling the step count produces white noise, and distillation is the workaround (see the sampler sketch after this list).
- DeepMind’s AlphaFold Nobel win, diffusion policy for robotics, and GenCast weather forecasting all use the same core procedure.
- Diffusion LLMs (continuous and discrete) were a top topic at NeurIPS 2025 and now generate code competitively.
- Chaubard’s squint test: LLMs emit one token at a time and never revise, while brains recurse and emit concepts; diffusion adds randomness and chunk-emission, closing part of that gap.
- Founders who aren’t training models should still update their priors: image quality improved ~1000x in 5 years purely through scaling, and the same trajectory applies to proteins, DNA, and robotics.
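A minimal sketch of the flow-matching training step described above, assuming a PyTorch model with signature `model(x_t, t)` and straight-line interpolation paths; the names are illustrative, not code from the talk:

```python
import torch

def flow_matching_step(model, optimizer, x1):
    """One training step: regress the velocity that carries noise x0 to data x1."""
    x0 = torch.randn_like(x1)                      # pure Gaussian noise
    t = torch.rand(x1.shape[0], device=x1.device)  # uniform t in [0, 1]
    t_ = t.view(-1, *([1] * (x1.dim() - 1)))       # broadcast t over data dims
    xt = (1 - t_) * x0 + t_ * x1                   # point on the straight noise-to-data path
    target = x1 - x0                               # constant velocity along that path
    loss = ((model(xt, t) - target) ** 2).mean()   # MSE on the predicted velocity
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under these assumptions the whole loop does fit the ~10-15 line claim: there is no per-step noise prediction, just a regression onto x1 - x0.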
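For the schedule bullet, a sketch of the cosine alpha-bar schedule from Nichol & Dhariwal (2021), assuming that is the "cosine-like" schedule the talk refers to:

```python
import math
import torch

def cosine_betas(T: int, s: float = 0.008) -> torch.Tensor:
    """Betas derived from the cosine alpha-bar schedule (Nichol & Dhariwal, 2021)."""
    t = torch.linspace(0, T, T + 1) / T            # T+1 grid points on [0, 1]
    alpha_bar = torch.cos((t + s) / (1 + s) * math.pi / 2) ** 2
    alpha_bar = alpha_bar / alpha_bar[0]           # normalize so alpha_bar(0) = 1
    betas = 1 - alpha_bar[1:] / alpha_bar[:-1]     # beta_t = 1 - abar_t / abar_{t-1}
    return betas.clamp(max=0.999)                  # clip to keep betas finite near t = T
```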
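And a bare Euler sampler to make the step-count point concrete: `steps` fixes the time grid the model is queried on, and, per the talk, pushing past the grid a model was trained or distilled for degrades output to noise. A sketch with illustrative names only:

```python
import torch

@torch.no_grad()
def euler_sample(model, shape, steps, device="cpu"):
    """Integrate the learned velocity from noise (t=0) to data (t=1) in `steps` Euler steps."""
    x = torch.randn(shape, device=device)          # start from pure noise
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + model(x, t) * dt                   # one Euler step along the learned flow
    return x
```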
2026-01-22 · Watch on YouTube