Our eighth generation TPUs: two chips for the agentic era

· ai hardware ·

Article

TL;DR

Google’s eighth-generation TPUs split into two chips: TPU 8t delivers 121 ExaFlops in a single training superpod, while TPU 8i is a dedicated chip for inference and post-training.

Key Takeaways

  • Single TPU 8t superpod: 9,600 chips, 2 petabytes of shared HBM4, 121 ExaFlops of compute (per-chip math sketched below)
  • 2x better performance-per-watt than the previous generation; the first generation to split training and inference into separate chips
  • Google’s vertically integrated stack may make its infrastructure permanently cheaper than any chip vendor’s
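
As a rough sanity check, the pod-level numbers above imply the per-chip figures below. This is a back-of-envelope sketch only: the per-chip split is derived from the quoted totals, not an official spec, and the precision behind the 121 ExaFlops figure is unspecified in the post.

```python
# Per-chip figures derived from the pod-level totals quoted above
# (9,600 chips, 2 PB shared HBM4, 121 ExaFlops). Inferred, not official.

POD_CHIPS = 9_600
POD_HBM_BYTES = 2e15   # 2 petabytes of shared HBM4
POD_FLOPS = 121e18     # 121 ExaFlops (numeric precision unspecified)

flops_per_chip = POD_FLOPS / POD_CHIPS
hbm_per_chip = POD_HBM_BYTES / POD_CHIPS

print(f"~{flops_per_chip / 1e15:.1f} PFLOPs per chip")   # ~12.6 PFLOPs
print(f"~{hbm_per_chip / 1e9:.0f} GB of HBM4 per chip")  # ~208 GB
```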

Discussion

Top comments:

  • [pmb]: Google’s whole-datacenter design context gives it a permanent cost-efficiency advantage over chip vendors
  • [TheMrZZ]: 121 ExaFlops and 2 PB of shared memory in one pod is a genuine competitive moat
  • [WarmWash]: Gemini uses drastically fewer tokens than rivals despite Google having the most compute
  • [fulafel]: Separate inference and training chips raise the question of whether Nvidia users do the same (see the sketch after this list)
  • [Keyframe]: Google has been quietly gaining consumer share without infrastructure failures while others stumbled
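
On fulafel’s point about separate inference and training silicon: one common rationale is that per-token decoding is memory-bandwidth-bound while training is compute-bound. The sketch below illustrates this with arithmetic intensity (FLOPs per byte of weight traffic); the model size, weight precision, and token counts are hypothetical illustrative numbers, not figures from the post.

```python
# Why inference and training stress hardware differently: autoregressive
# decoding streams every weight to produce one token (little math per
# byte), while training amortizes weight traffic over many tokens.
# All numbers here are hypothetical.

PARAMS = 70e9        # hypothetical 70B-parameter model
BYTES_PER_PARAM = 2  # bf16 weights

def arithmetic_intensity(tokens_per_pass: int) -> float:
    """FLOPs per byte of weight traffic for one forward pass."""
    flops = 2 * PARAMS * tokens_per_pass      # ~2 FLOPs per param per token
    weight_bytes = PARAMS * BYTES_PER_PARAM   # weights streamed once per pass
    return flops / weight_bytes

print(f"decode, 1 token:   {arithmetic_intensity(1):.0f} FLOPs/byte")    # ~1: memory-bound
print(f"train, 4k tokens:  {arithmetic_intensity(4096):.0f} FLOPs/byte") # ~4096: compute-bound
```

A chip built for the first regime wants maximum HBM bandwidth per FLOP; one built for the second wants the reverse, which is one plausible reading of why the 8t/8i split exists.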

Discuss on HN