Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

· ai llm open-source ·

TL;DR

The Qwen3.6-27B dense model claims flagship-level coding benchmark scores and runs in 24GB of VRAM.

Key Takeaways

  • The Q4_K_M quant runs at ~25 tok/s on an M5 Pro and fits comfortably in 24GB of VRAM
  • Local models may now handle 95% of coding tasks, threatening cloud subscriptions
  • Qualitative agentic reports still mixed; benchmarks don’t reflect multi-step reliability
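The "fits in 24GB VRAM" claim is easy to sanity-check with back-of-the-envelope arithmetic. A rough sketch, assuming Q4_K_M averages about 4.85 bits per weight (a commonly cited effective rate, not an official figure) and ignoring KV-cache and runtime overhead:

```python
# Back-of-the-envelope memory estimate for a Q4_K_M quant of a 27B dense model.
# Assumption: ~4.85 effective bits per weight for Q4_K_M (approximate,
# varies by model and quantizer version).
PARAMS = 27e9           # 27B parameters
BITS_PER_WEIGHT = 4.85  # assumed effective bits/weight for Q4_K_M

weight_bytes = PARAMS * BITS_PER_WEIGHT / 8
weight_gb = weight_bytes / 1e9

print(f"Estimated weight size: {weight_gb:.1f} GB")
```

This lands around 16.4 GB, in line with the 16.8GB quant mentioned in the comments, and leaves several gigabytes of headroom for KV cache and runtime buffers within a 24GB budget.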

Discussion

Top comments:

  • [simonw]: Ran the 16.8GB quant on an M5 Pro; 25 tok/s generation, and a better pelican than Opus 4.7
  • [jedisct1]: Found 8/10 security bugs on small codebases overnight; zero false positives
  • [datadrivenangel]: Only 11 tok/s on omlx; a task took an hour and produced broken code, vs. minutes with Sonnet
  • [finnjohnsen2]: The gap to Claude is closing fast, but Opus still wins on complex agentic tasks

Discuss on HN