Our eighth generation TPUs: two chips for the agentic era

· ai hardware · Source ↗

Article

TL;DR

Google’s TPU 8t superpod delivers 121 exaFLOPS across 9,600 chips with 2 PB of shared HBM.

Key Takeaways

  • A single pod exceeds the combined compute of the top ten supercomputers; separate chip variants target training and inference
  • Google’s vertically integrated stack avoids the “Nvidia tax”, a structural cost moat over its rivals
  • Gemini generates fewer tokens per task than rivals: either an efficiency win or a reasoning gap
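The headline pod numbers imply per-chip figures that are easy to sanity-check. A minimal sketch of the arithmetic, assuming the 121 exaFLOPS figure is aggregate peak throughput and that HBM is divided evenly across the 9,600 chips:

```python
# Back-of-the-envelope per-chip figures from the digest's pod-level numbers:
# 121 exaFLOPS of compute, 9,600 chips, 2 PB of shared HBM.
pod_flops = 121e18       # 121 exaFLOPS for the whole pod
chips = 9_600
pod_hbm_bytes = 2e15     # 2 PB shared HBM

flops_per_chip = pod_flops / chips             # ~1.26e16 FLOPS
hbm_per_chip_gb = pod_hbm_bytes / chips / 1e9  # ~208 GB

print(f"{flops_per_chip / 1e15:.1f} PFLOPS per chip")
print(f"{hbm_per_chip_gb:.0f} GB HBM per chip")
```

That works out to roughly 12.6 PFLOPS and ~208 GB of HBM per chip, figures implied by, not stated in, the source.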

Discussion

Top comments:

  • [TheMrZZ]: 9,600 chips, 2PB shared HBM, and double the inter-chip bandwidth of the prior gen; looks genuinely competitive
  • [pmb]: At scale, Google’s whole-datacenter design always wins on cost efficiency over chip vendors
  • [WarmWash]: Gemini uses far fewer tokens than Anthropic/OpenAI — mysterious given Google’s compute advantage
  • [mlmonkey]: One TPU 8t pod is 10x the compute of the entire TOP500 supercomputer list combined

Discuss on HN