Our eighth-generation TPUs: two chips for the agentic era
Article
TL;DR
Google’s TPU 8t superpod delivers 121 ExaFLOPS across 9,600 chips with 2 PB of shared HBM.
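For scale, a back-of-the-envelope division of the headline figures (a minimal sketch; it assumes the 121 ExaFLOPS and 2 PB numbers are pod-wide aggregates, as the summary implies, and uses decimal units):

```python
# Rough per-chip arithmetic from the TL;DR figures.
# Assumption: 121 ExaFLOPS and 2 PB HBM are pod-wide aggregates
# (the precision/datatype of the FLOPS figure is unspecified).
POD_FLOPS = 121e18    # 121 ExaFLOPS
POD_HBM_BYTES = 2e15  # 2 PB shared HBM (decimal petabytes)
CHIPS = 9_600

flops_per_chip = POD_FLOPS / CHIPS    # ~1.26e16 FLOPS
hbm_per_chip = POD_HBM_BYTES / CHIPS  # ~2.08e11 bytes

print(f"{flops_per_chip / 1e15:.1f} PFLOPS per chip")  # 12.6 PFLOPS per chip
print(f"{hbm_per_chip / 1e9:.0f} GB HBM per chip")     # 208 GB HBM per chip
```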
Key Takeaways
- One pod exceeds the combined compute of the top 10 supercomputers; separate chips for training vs. inference
- Google’s vertically integrated stack eliminates the “Nvidia tax,” giving it a structural cost moat over all rivals
- Gemini generates fewer tokens per task than rival models, which is either an efficiency win or a reasoning gap (see the cost sketch after this list)
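Why the token takeaway matters economically: at equal per-token pricing, serving cost scales linearly with tokens emitted per task. A minimal sketch of that arithmetic; the token counts and the price are hypothetical placeholders, not measured figures:

```python
# Hypothetical illustration: fewer output tokens per task means a
# proportionally lower serving cost, regardless of whether shorter
# traces reflect efficiency or shallower reasoning.
PRICE_PER_MTOK = 10.0  # hypothetical $ per million output tokens

def cost_per_task(tokens: int, price_per_mtok: float = PRICE_PER_MTOK) -> float:
    """Dollar cost of one task, given its output-token count."""
    return tokens / 1e6 * price_per_mtok

# Made-up counts: a terse model at 2,000 tokens/task pays one third
# the per-task serving cost of a verbose one at 6,000.
print(f"${cost_per_task(2_000):.2f}")  # $0.02
print(f"${cost_per_task(6_000):.2f}")  # $0.06
```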
Discussion
Top comments:
- [TheMrZZ]: 9,600 chips, 2 PB of shared HBM, and double the inter-chip bandwidth of the prior generation; this looks genuinely competitive
- [pmb]: At scale, Google’s whole-datacenter design always wins on cost efficiency over chip vendors
- [WarmWash]: Gemini uses far fewer tokens than Anthropic/OpenAI models, which is puzzling given Google’s compute advantage
- [mlmonkey]: One TPU 8t pod is 10x the compute of the entire TOP500 supercomputer list combined
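A quick order-of-magnitude check on mlmonkey’s claim; the TOP500 aggregate below is an assumed figure (roughly 12 FP64 ExaFLOPS of Rmax across the whole list), and the comparison mixes low-precision AI FLOPS with FP64 LINPACK numbers, so treat the ratio as loose:

```python
# Order-of-magnitude check on the "10x the TOP500 combined" claim.
# TOP500_AGGREGATE_EXAFLOPS is an assumption, not a sourced figure;
# the pod number is presumably low-precision AI FLOPS, so this is
# an apples-to-oranges ratio at best.
POD_EXAFLOPS = 121.0              # from the TL;DR
TOP500_AGGREGATE_EXAFLOPS = 12.0  # assumed FP64 Rmax for the full list

ratio = POD_EXAFLOPS / TOP500_AGGREGATE_EXAFLOPS
print(f"~{ratio:.0f}x the assumed TOP500 aggregate")  # ~10x
```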