Our eighth generation TPUs: two chips for the agentic era
https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/
TL;DR
Google’s TPU 8t superpod hits 121 ExaFLOPS with 2 PB of shared HBM, purpose-built for massive-model inference.
Key Takeaways
- One TPU 8t pod outcomputes the top 10 supercomputers combined: 121 ExaFLOPS vs. their 11,487 PetaFLOPS total (see the arithmetic check after this list)
- Vertical integration gives Google a structural cost advantage that Nvidia-dependent rivals can’t replicate
- Separate training (8t) and inference (8i) chips signal that Google is optimizing for inference economics at scale
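A quick sanity check of that headline comparison, using only the figures quoted above (the 11,487 PetaFLOPS aggregate for the top 10 systems is taken from the takeaway, not independently verified):

```python
# Compare one TPU 8t superpod against the top 10 supercomputers combined,
# using the numbers quoted in the takeaways above.
POD_EXAFLOPS = 121          # claimed compute of one TPU 8t pod
TOP10_PETAFLOPS = 11_487    # claimed combined compute of the top 10 supercomputers

pod_petaflops = POD_EXAFLOPS * 1_000  # 1 ExaFLOPS = 1,000 PetaFLOPS
ratio = pod_petaflops / TOP10_PETAFLOPS

print(f"One pod: {pod_petaflops:,} PetaFLOPS; top 10 combined: {TOP10_PETAFLOPS:,} PetaFLOPS")
print(f"One pod is roughly {ratio:.1f}x the top 10 combined")
# Output: One pod is roughly 10.5x the top 10 combined
```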
Discussion
Top comments:
- [pmb]: Whole-datacenter chip design context is an unbeatable moat for Google at scale
- [mlmonkey]: 121 ExaFLOPS per pod dwarfs the entire TOP500 list combined, a context-setting stat
- [WarmWash]: Gemini uses far fewer tokens than GPT/Claude — deliberate efficiency or compute constraint?
- [jjice]: Google’s vertical stack looked weak until Gemini 2.5 — now it looks like the winning bet
| Field | Value |
| --- | --- |
| Type | Link |
| Added | Apr 23, 2026 |
| Modified | Apr 23, 2026 |
| Comments | 184 |
| HN ID | 47862497 |
| Score | 377 |
| Target URL | https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/ |