Xiaomi’s MiMo-V2.5-Pro is now MIT-licensed on HuggingFace, scoring 57.2 on SWE-Bench Pro, within half a point of Claude Opus 4.6.
Key Takeaways
Three demos make the capability case: a Rust compiler for Peking University’s SysY language (4.3 hrs, 233/233 hidden tests), a working video editor (11.5 hrs, 1,868 tool calls, 8,192 lines), and an ngspice analog LDO circuit iterated to spec in ~1 hour.
Self-correction under load is a notable property: during the compiler run, a refactoring pass at turn 512 broke two tests; the model diagnosed the regression and recovered without human intervention.
Long context is addressed architecturally: hybrid Local Sliding Window Attention (128-token window) plus Global Attention layers cuts KV-cache storage by roughly 7x; GraphWalks scores are non-zero at 1M tokens, where V2-Pro scored zero.
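The cache saving follows from simple accounting: global layers must keep KV entries for the whole sequence, while sliding-window layers keep only the last 128 tokens. A minimal sketch, assuming a hypothetical 70-layer stack with one global layer per seven (the announcement does not specify the actual interleave; these numbers are chosen to reproduce the reported ~7x):

```python
def kv_cache_tokens(seq_len: int, n_layers: int, global_every: int, window: int) -> int:
    """Total cached tokens across layers: global layers retain the full
    sequence, sliding-window layers retain only the last `window` tokens."""
    total = 0
    for layer in range(n_layers):
        if layer % global_every == 0:      # assumed interleaving pattern
            total += seq_len               # global attention: full KV history
        else:
            total += min(seq_len, window)  # local attention: 128-token window
    return total

seq_len, n_layers, window = 1_000_000, 70, 128   # hypothetical layer count
full = seq_len * n_layers                        # all-global baseline
hybrid = kv_cache_tokens(seq_len, n_layers, global_every=7, window=window)
print(f"reduction: {full / hybrid:.1f}x")        # → reduction: 7.0x
```

At long sequence lengths the window term becomes negligible, so the reduction approaches the inverse of the global-layer fraction (here 1/7).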
Token efficiency claim: ~70K tokens per ClawEval trajectory vs. an estimated 120K+ for Opus 4.6, Gemini 3.1 Pro, and GPT-5.4 at comparable pass rates; the figure is self-reported, but meaningful if it holds in production.
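Taking both self-reported figures at face value, the claimed saving works out to roughly 42% fewer tokens per trajectory:

```python
mimo, rivals = 70_000, 120_000          # self-reported tokens per ClawEval trajectory
savings = 1 - mimo / rivals             # fraction of tokens avoided vs. the 120K estimate
print(f"{savings:.0%} fewer tokens")    # → 42% fewer tokens
```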
Deployment path: 1.02T parameters in FP8, SGLang recommended, vLLM supported; temperature 1.0 / top_p 0.95; works with Claude Code, OpenCode, and Kilo agentic scaffolds out of the box.
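Both SGLang and vLLM expose an OpenAI-compatible HTTP endpoint, so the recommended sampling settings can be applied per request. A minimal stdlib-only sketch, assuming a local server on port 8000 and a hypothetical HuggingFace repo id (neither is confirmed by the announcement):

```python
import json
import urllib.request

def build_payload(prompt: str) -> dict:
    """Chat-completions payload carrying the recommended decode settings."""
    return {
        "model": "XiaomiMiMo/MiMo-V2.5-Pro",  # hypothetical repo id
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,                    # recommended setting
        "top_p": 0.95,                         # recommended setting
    }

def query(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST one chat turn to a local SGLang/vLLM OpenAI-compatible server."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Agentic scaffolds like Claude Code and OpenCode can typically be pointed at the same endpoint by overriding their base URL, which is presumably how the out-of-the-box compatibility works.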
Hacker News Comment Review
The single comment confirms the open-source timing: weights had been available for roughly a week before the MIT-licensed HuggingFace drop, suggesting the benchmarks and demos preceded the public release window.
No substantive technical debate has formed yet around inference costs, benchmark methodology, or the self-reported token efficiency figures.