Gemini 3.5 Flash

May 19, 2026 · ai ai-agents coding

TLDR

At I/O 2026, Google releases Gemini 3.5 Flash, a fast frontier model targeting agentic workflows, coding, and multimodal tasks.

Outperforms Gemini 3.1 Pro on Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo), and MCP Atlas (83.6%) while being 4x faster in output tokens/second than other frontier models.
Priced at $1.50/$9.00 per million input/output tokens; available via Gemini API, Google AI Studio, Android Studio, Gemini Enterprise, and the Gemini app globally.
Powers Gemini Spark, a 24/7 personal AI agent that is rolling out to trusted testers now and to Google AI Ultra subscribers in the US next week.
Antigravity harness enables parallel subagent deployment for multi-step workflows; enterprise partners include Shopify, Macquarie Bank, Salesforce, Ramp, Xero, and Databricks.
Gemini 3.5 Pro is in internal testing and expected next month.
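To make the listed price concrete, here is a minimal sketch of what a single request would cost at $1.50/$9.00 per million input/output tokens. The function name and the example token counts are hypothetical, chosen only for illustration; real agentic workloads will vary widely.

```python
# Listed prices from the summary above, in USD per million tokens.
INPUT_USD_PER_MTOK = 1.50
OUTPUT_USD_PER_MTOK = 9.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical agentic step (illustrative numbers, not from the source):
# 200k tokens of context in, 20k tokens generated.
cost = request_cost_usd(200_000, 20_000)
print(f"${cost:.2f}")  # prints "$0.48"
```

Because output tokens cost 6x as much as input tokens at these prices, workloads that generate long outputs are likely to see the output side dominate the bill.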

Pricing drew sharp criticism: 3.5 Flash at $1.50/$9.00 is a 5x increase over 2.5 Flash, and end-to-end eval costs on artificialanalysis.ai show roughly 9x the total spend of 2.5 Flash, making it more expensive than 2.5 Pro was.
The January 2025 knowledge cutoff, roughly 16 months before the May 2026 release, was flagged as a significant gap for time-sensitive agentic use cases.
Quota exhaustion after only two Antigravity prompts on the Google AI Pro plan was reported as a practical usability blocker, independent of the benchmark results.

@easygenes: Infers parameter count and architecture from TPU 8i memory/bandwidth specs since Google discloses nothing; estimates point toward ~400B total parameters.
@jl: Full-eval cost comparison shows 3.5 Flash at $1,552 vs 2.5 Flash at $172, a 9x increase driven by both higher per-token price and higher token usage.
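The claim that the 9x gap comes from both price and token usage can be checked with a little arithmetic. The sketch below uses the two full-eval totals quoted by @jl and, purely as an assumption, treats the 5x price increase cited in the criticism as a uniform blended price ratio; the source does not say how input and output prices are weighted, so the implied token ratio is only a rough estimate.

```python
# Full-eval totals quoted by @jl, in USD.
EVAL_COST_35_FLASH_USD = 1552.0
EVAL_COST_25_FLASH_USD = 172.0

# ASSUMPTION: the 5x price increase applies uniformly to the blended
# per-token price. The source does not state the input/output weighting.
ASSUMED_PRICE_RATIO = 5.0


def implied_token_ratio(new_cost: float, old_cost: float, price_ratio: float) -> float:
    """Split a total-cost ratio into price and usage: cost = price * tokens."""
    return (new_cost / old_cost) / price_ratio


cost_ratio = EVAL_COST_35_FLASH_USD / EVAL_COST_25_FLASH_USD
token_ratio = implied_token_ratio(
    EVAL_COST_35_FLASH_USD, EVAL_COST_25_FLASH_USD, ASSUMED_PRICE_RATIO
)

print(f"total cost ratio: {cost_ratio:.2f}x")      # prints "total cost ratio: 9.02x"
print(f"implied token ratio: {token_ratio:.2f}x")  # prints "implied token ratio: 1.80x"
```

Under that assumption, 3.5 Flash would have consumed roughly 1.8x as many tokens as 2.5 Flash on the same evaluation, with the rest of the gap explained by the higher per-token price.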