Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

https://qwen.ai/blog?id=qwen3.6-27b

Article

TL;DR

Qwen’s new 27B dense model claims Opus-level coding and runs on a single 24GB GPU.

Key Takeaways

  • Q4_K_M quant fits in ~16.8GB, runs on an M5 Pro at 25 tok/s (loading sketch after this list)
  • Local models closing gap with frontier: Gemma 4 + Qwen 3.6 changed the calculus
  • Dense beats MoE for VRAM efficiency; no FIM support is a real gap for dev tooling
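
For context on the ~16.8GB figure: Q4_K_M in llama.cpp averages roughly 4.8–5 bits per weight, so 27B parameters work out to about 16–17GB of weights before the KV cache. Below is a minimal sketch of loading such a quant locally with llama-cpp-python; the GGUF filename, context size, and prompt are hypothetical placeholders, not anything published by Qwen.

    # Minimal sketch (assumes llama-cpp-python is installed and a Q4_K_M GGUF is on disk).
    # Rough size check: 27e9 params * ~4.85 bits / 8 bits per byte ≈ 16.4 GB of weights.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen3.6-27b-instruct-Q4_K_M.gguf",  # hypothetical filename
        n_gpu_layers=-1,  # offload every layer (Metal on Apple silicon, CUDA elsewhere)
        n_ctx=8192,       # KV cache grows with context, on top of the weight memory
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a function that merges two sorted lists."}],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])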

Discussion

Top comments:

  • [simonw]: Ran it on an M5 Pro: 25 tok/s, beats Opus 4.7 on the pelican test

    I ran it on an M5 Pro with 128GB of RAM, but it only needs ~20GB of that. I expect it will run OK on a 32GB machine.

  • [jedisct1]: Security audits overnight: found 8/10 bugs, zero false positives
  • [jameson]: What moat do Anthropic/OpenAI have when open models approach parity at a fraction of the cost?
  • [2001zhaozhao]: Cheap local hardware running small models 24/7 could make autonomous agent workflows economically viable

Discuss on HN


Type Link
Added Apr 23, 2026
Modified Apr 23, 2026
comments 315
hn_id 47863217
score 628
target_url https://qwen.ai/blog?id=qwen3.6-27b