r/LocalLLaMA top day | Jun 8, 2026, 4:26 AM | /u/alex20_202020

Google's Official Gemma 4 QAT Q4_0 GGUFs Have Higher Precision Than Unsloth's Q4_K_XL

Original: QATs Q4_0 from Google have more precision than Q4_K_XL from Unsloth (at least some)

A Reddit user found that Google's official Gemma 4 QAT Q4_0 GGUFs use mixed precision, making them larger and more precise than Unsloth's Q4_K_XL.

An analysis of Gemma 4 QAT GGUF files reveals that Google's official 'Q4_0' releases actually employ a mixed-precision strategy. For smaller models like E2B and E4B, Google keeps critical token embeddings in Q6_K and certain projection weights in F16. This makes Google's Q4_0 files larger and more precise than Unsloth's 'Q4_K_XL' versions, which default to standard Q4_0 for almost all tensors.
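
To see why a mixed tensor layout makes a file larger, here is a rough back-of-the-envelope sketch in Python. The block sizes are the standard ggml layouts for the three types mentioned above (Q4_0: 18 bytes per 32 weights, Q6_K: 210 bytes per 256 weights, F16: 2 bytes per weight). The parameter split is purely hypothetical and is not measured from any Gemma 4 file; the function names are illustrative only.

```python
# (bytes per block, weights per block) for the ggml tensor types named above.
BLOCK_LAYOUT = {
    "F16": (2, 1),
    "Q4_0": (18, 32),     # 16 bytes of 4-bit values + one 2-byte scale
    "Q6_K": (210, 256),   # 6-bit values plus per-group scales
}


def bits_per_weight(type_name):
    """Effective storage cost of one weight in the given tensor type."""
    block_bytes, block_weights = BLOCK_LAYOUT[type_name]
    return block_bytes * 8 / block_weights


def average_bpw(tensor_groups):
    """Average bits per weight over (type_name, parameter_count) pairs."""
    total_bits = sum(bits_per_weight(t) * n for t, n in tensor_groups)
    total_params = sum(n for _, n in tensor_groups)
    return total_bits / total_params


# Hypothetical parameter split, NOT taken from a real model file.
uniform = [("Q4_0", 1_000_000_000)]
mixed = [
    ("Q6_K", 200_000_000),   # assumed share for token embeddings
    ("F16", 50_000_000),     # assumed share for some projection weights
    ("Q4_0", 750_000_000),   # everything else
]

print(f"uniform Q4_0: {average_bpw(uniform):.4f} bpw")  # 4.5000
print(f"mixed layout: {average_bpw(mixed):.4f} bpw")    # 5.4875
```

Under these assumed proportions the mixed file costs about 22% more bits per weight than a uniform Q4_0 file, which is the kind of size gap the post describes. The real ratio depends on the actual per-tensor types in each GGUF.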

Summary generated by AI; the original post is authoritative.