
Google released Gemma 4 12B, a model designed for on-device agentic multimodal workflows. It uses an encoder-free architecture: visual and audio data are fed directly into a single decoder-only transformer rather than through separate modality-specific encoders. The design is meant to reduce latency and memory fragmentation, so that tasks like data processing and webpage building can run locally.
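As a rough illustration of what "encoder-free" can mean, the toy sketch below turns raw image patches and audio frames into token-sized vectors with a single linear projection each, then places them in the same sequence as text embeddings for one decoder to consume. This is not Gemma's actual implementation: every function name, dimension, and weight here is a hypothetical stand-in, and the decoder itself is omitted.

```python
import random

def patchify(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patch vectors."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(patch) for j in range(patch)]
        for r in range(0, h - patch + 1, patch)
        for c in range(0, w - patch + 1, patch)
    ]

def frame_audio(samples, frame):
    """Cut a 1-D waveform into non-overlapping frames of `frame` samples."""
    return [samples[i:i + frame] for i in range(0, len(samples) - frame + 1, frame)]

def make_weights(in_dim, out_dim, rng):
    """Random in_dim x out_dim matrix standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(out_dim)] for _ in range(in_dim)]

def project(vectors, weights):
    """Plain linear projection: no separate vision or audio encoder involved."""
    out_dim = len(weights[0])
    return [
        [sum(v[i] * weights[i][k] for i in range(len(v))) for k in range(out_dim)]
        for v in vectors
    ]

def build_sequence(text_ids, image, audio, d_model=8, patch=2, frame=4, vocab=16, seed=0):
    """Return one list of d_model-sized vectors covering image, audio, and text."""
    rng = random.Random(seed)
    embed_table = make_weights(vocab, d_model, rng)   # hypothetical text embeddings
    w_img = make_weights(patch * patch, d_model, rng)  # hypothetical patch projection
    w_aud = make_weights(frame, d_model, rng)          # hypothetical frame projection
    image_tokens = project(patchify(image, patch), w_img)
    audio_tokens = project(frame_audio(audio, frame), w_aud)
    text_tokens = [embed_table[t] for t in text_ids]
    # All modalities share one sequence; a single decoder stack would read it.
    return image_tokens + audio_tokens + text_tokens

# Usage: a 4x4 image (4 patches), 8 audio samples (2 frames), 3 text tokens.
seq = build_sequence([1, 2, 3], [[0.5] * 4 for _ in range(4)], [0.1] * 8)
print(len(seq))  # 9
```

The point of the sketch is only the data flow: with no per-modality encoder in front of it, the decoder sees a single contiguous sequence, which is the property the summary credits for the lower latency and memory fragmentation.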