Open Models at Google DeepMind — Cassidy Hardin, Google DeepMind


Summary based on the YouTube transcript and episode description.

Cassidy Hardin (Google DeepMind) details Gemma 4’s architecture: four model sizes, a mixture-of-experts (MoE) debut, and on-device multimodal audio support, all under Apache 2.0.

  • Gemma 4’s 26B MoE activates only 3.8B parameters per forward pass using 8 of 128 experts.
  • Gemma 4 31B dense ranked #3 on the LM Arena global leaderboard, outperforming models 20x its size.
  • Both 31B and 26B rank in the top 6 of all open-source models on LM Arena.
  • The E2B/E4B models (effective 2B/4B parameters) use per-layer embeddings (PLE) stored in flash memory rather than VRAM, enabling phone and laptop inference.
  • All Gemma 4 models ship under the Apache 2.0 license, replacing the prior, more restrictive license.
  • The 31B model supports a 256k context length with native function calling, thinking, and structured JSON output.
  • Audio support (a 35M-parameter Conformer encoder with a mel-spectrogram tokenizer) was added to E2B and E4B for on-device speech and translation.
  • Variable aspect-ratio and variable-resolution vision encoding replaces Gemma 3’s pan-and-scan multi-image workaround.
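The sparse-activation claim in the first bullet (8 of 128 experts active) corresponds to standard top-k expert routing. The sketch below is a generic illustration of that routing step under assumed shapes, not Gemma's actual implementation; only the 128-expert / top-8 numbers come from the episode summary.

```python
# Generic top-k MoE router sketch: pick 8 of 128 experts per token.
# The 128/8 figures are from the episode summary; everything else
# (hidden size, router weights) is an illustrative assumption.
import numpy as np

NUM_EXPERTS = 128
TOP_K = 8

def route(token_hidden, router_weights):
    """Return the indices and softmax gate weights of the top-k experts."""
    logits = token_hidden @ router_weights        # shape: (NUM_EXPERTS,)
    top_k = np.argsort(logits)[-TOP_K:]           # indices of the 8 chosen experts
    # Softmax over only the selected experts' logits.
    gates = np.exp(logits[top_k] - logits[top_k].max())
    gates /= gates.sum()
    return top_k, gates

rng = np.random.default_rng(0)
hidden = rng.standard_normal(64)                  # one token's hidden state
router = rng.standard_normal((64, NUM_EXPERTS))   # router projection
experts, gates = route(hidden, router)
print(len(experts), gates.sum())                  # 8 experts; gates sum to 1
```

Because only the chosen experts' feed-forward weights run per token, the active parameter count (3.8B of 26B here) is far below the total.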
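The per-layer-embeddings bullet describes keeping large embedding tables in flash instead of accelerator memory. A minimal way to picture that idea is a memory-mapped table where only the rows a token actually needs get paged into RAM; the file name, vocabulary size, and dimensions below are made-up illustration values, not Gemma's.

```python
# Sketch of the "embeddings in flash, not VRAM" idea via memory-mapping.
# All sizes and the file path are illustrative assumptions.
import numpy as np, os, tempfile

vocab, dim = 1000, 64
path = os.path.join(tempfile.mkdtemp(), "ple_layer0.npy")
np.save(path, np.random.default_rng(1)
        .standard_normal((vocab, dim)).astype(np.float32))

# mmap_mode="r" keeps the table on disk; pages load lazily on access.
table = np.load(path, mmap_mode="r")

token_ids = [3, 42, 7]
rows = table[token_ids]        # only these rows are copied into RAM
print(rows.shape)              # (3, 64)
```

The trade-off is latency per lookup in exchange for a much smaller resident memory footprint, which is what makes phone and laptop inference feasible for these models.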

2026-04-27 · Watch on YouTube