
Prime Intellect released prime-rl 0.6.0 for reinforcement learning on trillion-parameter Mixture-of-Experts models. The framework targets heavy agentic workloads like long-horizon software-engineering tasks. The team trained GLM-5 on SWE tasks at 131k sequence length with sub-5-minute steps using 28 H200 nodes.
Summary by ByteBrief