Google AI Edge ships LiteRT-LM, a production-ready on-device GenAI engine for Gemma 4. It enables native multimodal and agentic features on mobile and edge devices through memory-efficient dynamic loading. Multi-Token Prediction delivers up to a 2.2x inference speedup. The engine also supports Thinking Mode and Constrained Decoding for better reasoning and output control. LiteRT-LM adds native Swift APIs for Apple platforms and WebGPU-accelerated JavaScript APIs for the web, broadening cross-platform on-device deployment of Gemma 4.
Google Releases Gemma 4 12B for Laptop with AI Edge Stack
Summary by ByteBrief