Simon Willison's Weblog, Jun 16, 2026, 4:04 PM

Georgi Gerganov on Using Qwen3.6-27B Daily for Local Coding Tasks

Original: Quoting Georgi Gerganov

Qwen3.6-27B earns a strong local-coding endorsement from llama.cpp creator Georgi Gerganov, who uses it daily on M2 Ultra and RTX 5090 hardware.

Georgi Gerganov, creator of llama.cpp, endorses Qwen3.6-27B as a capable local model for everyday coding assistance, citing six weeks of daily use on Apple M2 Ultra and NVIDIA RTX 5090 hardware. He runs a minimal setup (the pi agent invoked as `pi -nc --offline`, plus a short custom system prompt) for routine maintainer tasks at ggml-org. His primary constraint is PR review time, which limits how heavily he can lean on the model.

Georgi Gerganov, the software engineer best known as the creator of llama.cpp and the ggml tensor library, has added his voice to a growing conversation about the current quality of locally run large language models. In a Hacker News comment responding to Vicki Boykis's June 2026 post "Running local models is good now," Gerganov offered a first-hand account of integrating Qwen3.6-27B into his day-to-day development workflow. Simon Willison surfaced and quoted the comment on his weblog.

Summaries are AI-generated; the original article is authoritative.