Our eighth generation TPUs: two chips for the agentic era

https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/

Article

TL;DR

Google’s eighth-generation TPU superpod hits 121 ExaFLOPS with 2 PB of shared HBM; the lineup pairs a training chip (8t) with an inference chip (8i) purpose-built for massive-scale serving.

Key Takeaways

  • One TPU 8t pod outcomputes the entire TOP500 supercomputer list combined (121 ExaFLOPS vs 11,487 PetaFLOPS, roughly 10.5×)
  • Vertical integration gives Google structural cost advantage Nvidia-dependent rivals can’t replicate
  • Separate training (8t) and inference (8i) chips signal Google optimizing for inference economics at scale
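The headline comparison mixes units (ExaFLOPS vs PetaFLOPS), which makes it easy to misread. A quick sanity check on the figures quoted above (both numbers taken from the takeaways; the ratio is derived, not from the article):

```python
# Unit check for the headline stat: 121 ExaFLOPS per pod vs
# 11,487 PetaFLOPS for the TOP500 supercomputers combined.
pod_exaflops = 121
top500_combined_petaflops = 11_487  # figure quoted in the takeaways

# 1 ExaFLOPS = 1,000 PetaFLOPS
pod_petaflops = pod_exaflops * 1_000
ratio = pod_petaflops / top500_combined_petaflops

print(f"Pod: {pod_petaflops:,} PFLOPS")
print(f"TOP500 combined: {top500_combined_petaflops:,} PFLOPS")
print(f"Ratio: {ratio:.1f}x")  # ~10.5x
```

So a single pod is not merely larger than the combined list, it is about an order of magnitude larger.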

Discussion

Top comments:

  • [pmb]: Whole-datacenter chip design context is an unbeatable moat for Google at scale
  • [mlmonkey]: 121 ExaFLOPS per pod dwarfs all top-500 supercomputers combined — context-setting stat
  • [WarmWash]: Gemini uses far fewer tokens than GPT/Claude — deliberate efficiency or compute constraint?
  • [jjice]: Google’s vertical stack looked weak until Gemini 2.5 — now it looks like the winning bet



Type      Link
Added     Apr 23, 2026
Modified  Apr 23, 2026
Comments  184
HN ID     47862497
Score     377
target_url https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/