MiniMax released MiniMax Sparse Attention (MSA), a two-branch sparse attention method trained with a 109B-parameter Mixture-of-Experts model on 3T tokens. It splits attention into an Index Branch and a Main Branch to reduce the quadratic cost of softmax attention. The method powers MiniMax-M3, a production model with an open-sourced inference kernel.
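
The summary does not say how the Index and Main Branches interact. As a rough illustration only, the sketch below assumes a pattern common to index-based sparse attention: a cheap index branch scores every key with small vectors and keeps the top-k positions, and the main branch runs softmax attention over only those positions. All function names, the `top_k` parameter, and the separate low-dimensional index vectors are hypothetical; this is not MiniMax's kernel or API.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def index_branch(q_idx, k_idx, top_k):
    """Assumed index branch: score all keys cheaply, keep the top_k positions."""
    scores = [dot(q_idx, k) for k in k_idx]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:top_k])


def main_branch(q, keys, values, selected):
    """Assumed main branch: scaled softmax attention over the selected positions only."""
    scale = 1.0 / math.sqrt(len(q))
    weights = softmax([dot(q, keys[i]) * scale for i in selected])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, selected)) for d in range(dim)]


def two_branch_attention(q, q_idx, keys, k_idx, values, top_k):
    """Single-query sketch: select positions, then attend over them."""
    selected = index_branch(q_idx, k_idx, top_k)
    return main_branch(q, keys, values, selected), selected


# Toy usage: four keys, of which the index branch keeps two.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
k_idx = [[0.1], [0.9], [0.8], [0.2]]
out, selected = two_branch_attention([1.0, 0.0], [1.0], keys, k_idx, values, top_k=2)
```

In this sketch the index branch still touches every key, but with much smaller vectors, while the softmax for each query runs over `top_k` entries instead of the full sequence. Whether MSA achieves its savings this way is not stated in the summary.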
Summaries by ByteBrief