
Xiaomi claims its MiMo-V2.5-Pro-UltraSpeed model achieves over 1,000 tokens per second at the 1-trillion-parameter scale using a standard 8-GPU commodity node. An API trial begins June 9. If confirmed, that would be a first for inference speed at that model size.
Summary by ByteBrief