Mark Zuckerberg — Llama 3, $10B models, Caesar Augustus, & 1 GW datacenters
Watch on YouTube ↗ · Summary based on the YouTube transcript and episode description.
Mark Zuckerberg explains why Meta will open-source even a $10B model and why concentrated AI is more dangerous than widespread AI.
- Llama 3 405B is still training; at its current checkpoint it scores ~85 on MMLU and is expected to lead benchmarks at release.
- The 8B Llama 3 model is nearly as capable as the largest Llama 2 model Meta released.
- Meta trained Llama 3 70B on ~15 trillion tokens and the model was still improving at the end — no clear saturation.
- No single gigawatt data center has been built yet; Zuckerberg sees energy permitting, not capital, as the next hard bottleneck for scaling.
- Zuckerberg argues a single actor holding vastly superior AI is a bigger risk than open-source proliferation — explicitly includes adversarial governments and untrustworthy companies.
- Meta’s custom silicon already handles inference for ranking and recommendations (Reels, Feed, Ads), freeing Nvidia GPUs for training.
- Meta has revenue-share deals with major cloud providers (Azure, AWS, etc.) to host Llama 2; expects this to grow with larger models.
- Zuckerberg likens Augustus redefining peace as a positive-sum concept to open source: a model many investors still cannot conceptualize as genuinely valuable.
2024-04-18