r/LocalLLaMA top day | Jun 7, 2026, 12:53 PM | u/pinkyellowneon
llama.cpp Gemma4 MTP Support Merged
Original: llama.cpp Gemma4 MTP support merged!
llama.cpp merged Gemma4 MTP support for speculative decoding acceleration.
llama.cpp PR #23398 was merged on June 7, 2026, adding MTP (multi-token prediction) support for Gemma4 models. The author reports an average speedup of over 2x on dense models, no observed speedup on MoE models, and AIME-26 results replicated at around 87%. Support currently covers the 31B and 26B-4B variants; E4B and E2B are not yet supported. Multi-GPU setups may need extra draft-device configuration.
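For intuition on where a speedup like the one reported above can come from, here is a minimal sketch of the standard speculative-decoding estimate of tokens emitted per verification pass. It is not taken from the PR: the function name, the acceptance probability, and the draft length are all hypothetical, and it assumes each drafted token is accepted independently with the same probability.

```python
def expected_tokens_per_step(accept_prob: float, draft_len: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumption (not from the PR): each of the draft_len drafted tokens is
    accepted independently with probability accept_prob, and the pass always
    yields one extra token from the target model itself.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if accept_prob == 1.0:
        # Every drafted token is accepted, plus the target model's own token.
        return float(draft_len + 1)
    # Geometric series: 1 + a + a^2 + ... + a^draft_len
    return (1.0 - accept_prob ** (draft_len + 1)) / (1.0 - accept_prob)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not measured Gemma4 numbers.
    for a in (0.5, 0.7, 0.9):
        print(a, expected_tokens_per_step(a, draft_len=3))
```

This counts tokens per pass, not wall-clock time. The real speedup also depends on how much drafting costs and on how the backend batches verification, which may be part of why results differ between dense and MoE models.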
Summary compiled by AI; the original r/LocalLLaMA post takes precedence.