Our eighth generation TPUs: two chips for the agentic era

https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/

Article

TL;DR

Google’s eighth-generation TPU superpod hits 121 ExaFLOPS with 2 PB of shared HBM; the lineup pairs a training chip (8t) with an inference chip (8i) purpose-built for massive-scale serving.

Key Takeaways

  • One TPU 8t pod outcomputes the entire TOP500 supercomputer list combined (121 ExaFLOPS vs 11,487 PetaFLOPS, roughly 10.5×)
  • Vertical integration gives Google structural cost advantage Nvidia-dependent rivals can’t replicate
  • Separate training (8t) and inference (8i) chips signal Google optimizing for inference economics at scale
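The headline comparison mixes units (ExaFLOPS vs PetaFLOPS), which makes it easy to misread. A quick sanity check on the figures quoted above (both numbers taken from the takeaways; the ratio is derived, not from the article):

```python
# Unit check for the headline stat: 121 ExaFLOPS per pod vs
# 11,487 PetaFLOPS for the TOP500 supercomputers combined.
pod_exaflops = 121
top500_combined_petaflops = 11_487  # figure quoted in the takeaways

# 1 ExaFLOPS = 1,000 PetaFLOPS
pod_petaflops = pod_exaflops * 1_000
ratio = pod_petaflops / top500_combined_petaflops

print(f"Pod: {pod_petaflops:,} PFLOPS")
print(f"TOP500 combined: {top500_combined_petaflops:,} PFLOPS")
print(f"Ratio: {ratio:.1f}x")  # ~10.5x
```

So a single pod is not merely larger than the combined list, it is about an order of magnitude larger.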

Discussion

Top comments:

  • [pmb]: Whole-datacenter chip design context is an unbeatable moat for Google at scale
  • [mlmonkey]: 121 ExaFLOPS per pod dwarfs all top-500 supercomputers combined — context-setting stat
  • [WarmWash]: Gemini uses far fewer tokens than GPT/Claude — deliberate efficiency or compute constraint?
  • [jjice]: Google’s vertical stack looked weak until Gemini 2.5 — now it looks like the winning bet



Type      Link
Added     Apr 23, 2026
Modified  Apr 23, 2026
Comments  184
HN ID     47862497
Score     377
target_url https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/