We Upgraded to a Frontier Model and Our Costs Went Down

· ai · Source ↗

TLDR

  • Running Opus 4.6 costs less than Sonnet 4.0 because a Haiku triager filters 80% of CI failures before they reach the frontier model.

Key Takeaways

  • Haiku triager with pgvector semantic search screens duplicates; 4 of 5 CI failures never reach Opus, at 25x lower cost per triage.
  • Opus never reads raw logs; it queries ClickHouse via SQL and spawns scoped Haiku sub-agents capped one level deep to prevent runaway costs.
  • Haiku handles 65% of input tokens but only 36% of LLM spend; removing the model hierarchy more than doubles the daily bill.
  • Context hygiene keeps Opus clean: sub-agents return structured summaries, not raw output; each sub-agent starts fresh and its context is discarded after.
  • The triager/orchestrator pattern generalizes to security logs, IoT telemetry, and financial data wherever most events are noise or known repeats.

Hacker News Comment Review

  • The sole commenter flagged the title as misleading clickbait; the actual insight is the Haiku-triager-plus-Opus-orchestrator architecture, not the counterintuitive cost headline.

Original | Discuss on HN