Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview

Apr 27, 2026 · ai-agents coding ai · Source ↗

TLDR

Dirac is an OSS coding agent forked from Cline that cuts API costs 64.8% using hash-anchored edits, AST-based context curation, and batched parallel operations.

Topped Terminal-Bench-2 leaderboard at 65.2% with gemini-3-flash-preview, beating Google’s official baseline (47.6%) and closed-source Junie CLI (64.3%).
2.8x cost reduction vs competing agents: $0.18 avg per task vs $0.38-$0.73 for Cline, Kilo, Opencode, Roo, and others on the same benchmark suite.
Core techniques: hash-anchored parallel file edits, AST-guided context fetching to avoid large file reads, batched I/O, and minimal prompting. No MCP.
Achieved 8/8 task accuracy on public repos (transformers, vscode, django) where most competitors landed 5-6/8.
Available as a VS Code extension and CLI (npm install -g dirac-cli); supports Anthropic, OpenAI, Gemini, OpenRouter, Groq, Mistral, xAI, and HuggingFace keys.

The harness-beats-model thesis dominated the thread: the 47.6% to 65.2% jump on the same underlying model was treated as proof that tooling engineering matters more than model selection at this tier.
Benchmark validity was contested: all evals ran on gemini-3-flash-preview only; commenters pushed for multi-model runs (e.g. Minimax 2.7) before accepting that cost and accuracy claims generalize across providers.
An opt-out telemetry endpoint at dirac.run/v1/event that captures API errors was flagged as a trust issue for a single-developer project with no mention of it in the README.

@GodelNumbering: Breaks down three concrete mechanisms – optimized hash-anchor edits, AST-driven context selection, and fully batched parallel reads/edits – with links to the technical writeup.
@kha1n3vol3: Real-world signal: using Dirac with Kimi 2.6 on a large Rust Clean Architecture refactor; found it more reliable than OpenCode, which corrupted .rs files and required a revert.