
Google released DiffusionGemma, an open-weight text diffusion model with 26 billion parameters. It generates 256 tokens in parallel rather than one at a time, running up to four times faster than traditional autoregressive LLMs. On an H100, it exceeds 1,000 tokens per second. Its mixture-of-experts architecture activates only 3.8 billion of those parameters per step.
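To put the quoted figures in proportion, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers stated above. The function names are made up for illustration, and the per-block time assumes the 1,000 tokens-per-second figure is sustained throughput and that 256 tokens form one generation block, neither of which the announcement confirms.

```python
# Figures quoted in the text above.
TOTAL_PARAMS = 26e9        # total parameters
ACTIVE_PARAMS = 3.8e9      # parameters activated per step
BLOCK_TOKENS = 256         # tokens generated in parallel
TOKENS_PER_SECOND = 1000   # stated lower bound on an H100


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that are used in a single step."""
    return active / total


def seconds_per_block(block_tokens: int, tokens_per_second: float) -> float:
    """Time to produce one block, assuming sustained throughput (an assumption)."""
    return block_tokens / tokens_per_second


frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
block_time = seconds_per_block(BLOCK_TOKENS, TOKENS_PER_SECOND)

print(f"active share of parameters: {frac:.1%}")
print(f"time per {BLOCK_TOKENS}-token block: under {block_time:.3f} s")
```

Under these assumptions, roughly one in seven parameters is active in any given step, and a full 256-token block would take about a quarter of a second or less.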