Our eighth generation TPUs: two chips for the agentic era

· ai hardware


TL;DR

Google’s TPU 8t superpod hits 121 exaFLOPS with 2 PB of shared HBM4, doubling prior-gen perf/watt.

Key Takeaways

  • One TPU 8t superpod = 9,600 chips, 2 PB shared HBM4, 121 exaFLOPS
  • Google’s vertical integration may make it the lowest-cost frontier inference provider in the long term
  • Gemini still uses fewer tokens than rivals despite Google having the most compute; the reasons are unclear
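
The pod-level figures above imply some per-chip numbers the article doesn't state directly. A quick back-of-envelope sketch (the per-chip values are derived arithmetic, not claims from the source):

```python
# Per-chip figures derived from the pod-level numbers above:
# 9,600 chips, 2 PB of shared HBM4, 121 exaFLOPS per superpod.
# These per-chip values are back-of-envelope arithmetic, not stated in the article.

CHIPS_PER_POD = 9_600
POD_HBM_PB = 2        # petabytes of shared HBM4
POD_EXAFLOPS = 121

# 1 PB = 1,000,000 GB; 1 exaFLOPS = 1,000 petaFLOPS
hbm_per_chip_gb = POD_HBM_PB * 1_000_000 / CHIPS_PER_POD
pflops_per_chip = POD_EXAFLOPS * 1_000 / CHIPS_PER_POD

print(f"HBM per chip:     ~{hbm_per_chip_gb:.0f} GB")
print(f"Compute per chip: ~{pflops_per_chip:.1f} PFLOPS")
```

That works out to roughly 208 GB of HBM4 and about 12.6 PFLOPS per chip, assuming the shared memory and compute are spread evenly across the pod.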

Discussion

Top comments:

  • [pmb]: Whole-datacenter design context gives Google a permanent cost-efficiency edge
  • [WarmWash]: Gemini uses far fewer tokens than OpenAI/Anthropic despite more compute
  • [mlmonkey]: A single pod delivers roughly 10x the combined petaFLOPS of the top 10 supercomputers
