Mark Zuckerberg — Llama 3, $10B models, Caesar Augustus, & 1 GW datacenters


Watch on YouTube ↗

Summary based on the YouTube transcript and episode description.

Mark Zuckerberg explains why Meta will open-source even a $10B model and why concentrated AI is more dangerous than widespread AI.

  • Llama 3 405B is still training; at its current checkpoint it scores ~85 on MMLU and is expected to lead benchmarks on release.
  • The 8B Llama 3 model is nearly as capable as the largest Llama 2 model released.
  • Meta trained Llama 3 70B on ~15 trillion tokens and the model was still improving at the end — no clear saturation.
  • No single gigawatt data center has been built yet; Zuckerberg sees energy permitting, not capital, as the next hard bottleneck for scaling.
  • Zuckerberg argues a single actor holding vastly superior AI is a bigger risk than open-source proliferation — explicitly includes adversarial governments and untrustworthy companies.
  • Meta’s custom silicon currently handles inference for ranking and recommendations (Reels, Feed, Ads), so its Nvidia GPUs can be reserved for training.
  • Meta has revenue-share deals with major cloud providers (Azure, AWS, etc.) to host Llama 2; expects this to grow with larger models.
  • Zuckerberg draws a parallel between Augustus redefining peace as something positive-sum and open source: a model whose value most investors still cannot conceptualize as genuine.

2024-04-18