
Gemma 4 models now use quantization-aware training (QAT) to cut memory use while preserving quality. These open-source models retain quality better than models compressed with post-training quantization. The QAT-optimized Gemma 4 models are available in five sizes: Gemma 4 E2B, Gemma 4 E4B, Gemma 4 12B, Gemma 4 26B A4B, and Gemma 4 31B. The compressed models run well on phones and laptops thanks to a custom mobile-quantization schema.
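To give a rough sense of why lower-precision weights matter on a memory-limited laptop, here is a back-of-the-envelope sketch. It assumes the parameter counts implied by the model names (12B, 31B) and a hypothetical 4-bit weight format; the summary does not state the actual bit width, and the estimate covers weights only, not activations or the KV cache. The function name `weight_memory_gib` is illustrative.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Parameter counts are read off the model names (assumption, not official figures).
models = [("Gemma 4 12B", 12e9), ("Gemma 4 31B", 31e9)]

for name, params in models:
    # 16-bit baseline versus a hypothetical 4-bit quantized format.
    for bits in (16, 4):
        size = weight_memory_gib(params, bits)
        print(f"{name} @ {bits}-bit weights: {size:.1f} GiB")
```

Under these assumptions, dropping from 16-bit to 4-bit weights shrinks the weight footprint by a factor of four, which is the kind of saving that could bring a 12B-parameter model within reach of a 16GB machine.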
Google Releases Gemma 4 12B for 16GB Laptop AI
Summary by ByteBrief