An author set up a local coding agent on macOS using Gemma 4 with Multi-Token Prediction (MTP) after internet outages left them without access to cloud agents. On an Apple M1 Max with 64 GB of memory, the model reached 72.2 tokens/second with 3 draft tokens, up from 58 tokens/second without MTP, or roughly 24% higher throughput.
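To put the two throughput figures in perspective, the short sketch below works out the relative speedup and the implied average time per token. The only inputs are the two numbers reported above; the function and variable names are illustrative and not taken from the author's setup.

```python
def speedup(baseline_tps: float, mtp_tps: float) -> float:
    """Return the throughput ratio of the MTP run over the baseline run."""
    return mtp_tps / baseline_tps


baseline_tps = 58.0  # tokens/second without MTP (figure from the text)
mtp_tps = 72.2       # tokens/second with 3 draft tokens (figure from the text)

ratio = speedup(baseline_tps, mtp_tps)
percent_gain = (ratio - 1.0) * 100.0

# Average wall-clock time per generated token, in milliseconds.
ms_per_token_baseline = 1000.0 / baseline_tps
ms_per_token_mtp = 1000.0 / mtp_tps

print(f"speedup: {ratio:.2f}x ({percent_gain:.1f}% more tokens/second)")
print(f"time per token: {ms_per_token_baseline:.1f} ms -> {ms_per_token_mtp:.1f} ms")
```

These are averages derived from the reported rates, so they say nothing about how throughput varies with prompt length or how often the draft tokens are accepted.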