Google's experimental DiffusionGemma model generates text by refining a block of tokens simultaneously, much as image diffusion models refine an image. The author tested it on an M4 Pro MacBook Pro using a 4-bit GGUF. Running it caused system-wide slowdowns, and it did not feel faster than Gemma 4 26B-A4B.
Summary by ByteBrief
Google's DiffusionGemma trades quality for speed