Chips | Techmeme | about 5 hours ago

Xiaomi claims 1K tokens/sec on 1T-parameter model

1 min read

Xiaomi claims its MiMo-V2.5-Pro-UltraSpeed model achieves over 1,000 tokens per second at the 1-trillion-parameter scale using a standard 8-GPU commodity node. An API trial begins June 9. If it holds up, the result would be a first for inference speed at that model size.

#xiaomi #ai #inference
