
Google released DiffusionGemma, a text model that generates 256 tokens in parallel using diffusion instead of left-to-right token generation. The model can self-correct its output during generation. This approach keeps GPUs saturated during local inference, where standard models leave GPUs idle most of the time.
Tap to vote and see what everyone thinks.