Our eighth generation TPUs: two chips for the agentic era
Article
TL;DR
Google’s TPU 8t delivers 121 ExaFlops in a single superpod, while the new TPU 8i is a separate chip dedicated to inference and post-training.
Key Takeaways
- Single TPU 8t superpod: 9,600 chips, 2 petabytes of shared HBM4, 121 ExaFlops of compute (per-chip arithmetic sketched after this list)
- 2x better performance per watt than the previous generation; the first generation to split training and inference into separate chips
- Google’s vertical integration across the stack may make its infrastructure permanently cheaper than any merchant chip vendor’s
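For scale, the pod-level figures imply roughly 208 GB of HBM4 and about 12.6 PFLOPS per chip. A minimal back-of-envelope sketch, assuming only the 9,600-chip, 2 PB, 121 ExaFlops numbers from the takeaways; the derived per-chip values are illustrative arithmetic, not published specs:

```python
# Back-of-envelope per-chip figures implied by the pod-level numbers above.
# Derived values are illustrative arithmetic, not published specs.

chips = 9_600          # TPU 8t chips per superpod (from the article)
pod_hbm_bytes = 2e15   # 2 PB of shared HBM4 per pod
pod_flops = 121e18     # 121 ExaFlops of compute per pod

hbm_per_chip_gb = pod_hbm_bytes / chips / 1e9
flops_per_chip_pf = pod_flops / chips / 1e15

print(f"HBM4 per chip:    ~{hbm_per_chip_gb:.0f} GB")      # ~208 GB
print(f"Compute per chip: ~{flops_per_chip_pf:.1f} PFLOPS") # ~12.6 PFLOPS
```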
Discussion
Top comments:
- [pmb]: Google’s whole-datacenter design context gives it a permanent cost-efficiency advantage over chip vendors
- [TheMrZZ]: 121 ExaFlops and 2PB shared memory in one pod is a genuine competitive moat
- [WarmWash]: Gemini uses drastically fewer tokens than rivals, despite Google having the most compute
- [fulafel]: Separate inference and training chips raise the question of whether Nvidia customers do the same
- [Keyframe]: Google has been quietly gaining consumer share without infrastructure failures while others stumbled