TLDR
- Laguna ships M.1 (225B-A23B MoE) and open-weight XS.2 (33B-A3B, Apache 2.0) for agentic coding, with an ACP agent runtime released alongside.
Key Takeaways
- XS.2 reaches 44.5% SWE-bench Pro and 68.2% SWE-bench Verified at 33B total / 3B activated params (see the MoE routing sketch after this list); weights are free under Apache 2.0.
- M.1 (225B-A23B) trained from scratch on 30T tokens across 6,144 NVIDIA Hopper GPUs; scores 46.9% SWE-bench Pro, 40.7% Terminal-Bench 2.0.
- The ACP server (agent harness) is the same runtime used for RL training and evaluation, released to close the model-to-agent gap; a sketch of an ACP-style exchange follows this list.
- AutoMixer trains ~60 proxy models per run to optimize the pre-training data mix, delivering targeted gains on code and math over manual ablations; a toy version of the loop is sketched below.
- Synthetic data makes up 13% of XS.2's pre-training mix; the Laguna family as a whole consumed 4.4T+ synthetic tokens across pre-training stages.
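
The A-suffix in names like 33B-A3B and 225B-A23B distinguishes total from activated parameters: a sparse MoE routes each token to only a few experts, so most weights sit idle on any given forward pass. Below is a toy top-k routing sketch that illustrates the mechanism only; all sizes and names are invented for the example, not taken from Laguna's architecture.

```python
# Toy sketch of top-k MoE routing (illustrative only; not Laguna's actual
# architecture). Shows why a "33B-A3B" model touches only a small slice of
# its weights per token: each token is routed to k of N experts, so the
# activated parameter count is roughly shared params + k/N of expert params.
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, TOP_K, D = 8, 2, 16          # hypothetical sizes
experts = [rng.standard_normal((D, D)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS))

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]                          # chosen experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over k
    # Only TOP_K of N_EXPERTS matrices are multiplied -- the "activated" params.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
print(moe_forward(token).shape)  # (16,)
```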
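On the ACP harness: the Agent Client Protocol (the spec Zed and the “pool” agent speak) is newline-delimited JSON-RPC 2.0 over stdio. The sketch below shows the shape of a minimal client exchange; the method names follow the public ACP spec, but the payload fields are simplified and `laguna-agent` is a hypothetical binary name, not a confirmed artifact of this release.

```python
# Minimal sketch of an ACP-style exchange over stdio (JSON-RPC 2.0).
# Method names follow the public Agent Client Protocol spec (initialize,
# session/new, session/prompt); payload shapes here are simplified
# illustrations, not Laguna's actual harness.
import json
import subprocess

def rpc(proc, id_, method, params):
    """Send one JSON-RPC request to the agent and read one response line."""
    msg = {"jsonrpc": "2.0", "id": id_, "method": method, "params": params}
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())

# "laguna-agent" is a hypothetical ACP server binary.
agent = subprocess.Popen(
    ["laguna-agent"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
)
rpc(agent, 1, "initialize", {"protocolVersion": 1})
session = rpc(agent, 2, "session/new", {"cwd": "/repo", "mcpServers": []})
rpc(agent, 3, "session/prompt", {
    "sessionId": session["result"]["sessionId"],
    "prompt": [{"type": "text", "text": "Fix the failing test in utils.py"}],
})
```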
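The article doesn't detail AutoMixer's internals; the following is a hypothetical sketch of the general proxy-model approach the bullet describes: sample candidate domain mixtures, train a cheap proxy on each, score the proxies on the target domains (code and math here), and keep the best mix. Every name and the stand-in scoring function are assumptions for illustration.

```python
# Hypothetical sketch of a proxy-model data-mixture search in the spirit of
# what the AutoMixer bullet describes: train many small proxy models on
# candidate mixtures, score each on target domains, and keep the best mix.
# All names and the scoring function are illustrative assumptions.
import random

DOMAINS = ["web", "code", "math", "papers"]

def sample_mixture():
    """Draw random domain weights that sum to 1 (a candidate data mix)."""
    raw = [random.random() for _ in DOMAINS]
    total = sum(raw)
    return {d: w / total for d, w in zip(DOMAINS, raw)}

def train_and_score_proxy(mixture):
    """Stand-in for training a small proxy model on `mixture` and
    evaluating it on held-out code/math sets; returns a scalar score."""
    return mixture["code"] * 0.6 + mixture["math"] * 0.4  # fake objective

best_mix, best_score = None, float("-inf")
for _ in range(60):                      # ~60 proxies per run, per the bullet
    mix = sample_mixture()
    score = train_and_score_proxy(mix)
    if score > best_score:
        best_mix, best_score = mix, score

print(best_mix, round(best_score, 3))
```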
Hacker News Comment Review
- Early testers using the “pool” agent report fast responses and stronger ACP spec adherence than Codex or opencode; it works well in Zed today.
- Laguna’s own benchmark table shows Qwen3.6 35B outperforming M.1 225B on Terminal-Bench 2.0 and SWE-bench Pro, which commenters flagged as a notable efficiency gap.
- Commenters found the color-coded benchmark charts visually polished but difficult to parse; extracting the signal takes effort despite the clean design.
Notable Comments
- @franksiem: Long-time observer who expected perpetual stealth; sees the release as proof it “materialized into something competitive.”
Original | Discuss on HN