The latest language-models news, distilled by AI into sharp ~100-word summaries. ByteBrief tracks language-models across dozens of tech sources and brings you only what matters, updated hourly.

A study by Anthropic and Stanford finds that frequent tasks dominate training dynamics, which is why only larger language models learn rare tasks. Small models fail to retain rare skills because of update-and-forget loops, in which frequent tasks overwrite the signal from rare tasks. A model with N neurons prioritizes the N most useful features, ranked by task frequency and importance. Only large models master tasks that make up 0.25 percent of the training data. Once frequent tasks are mastered, capacity shifts to rare tasks, allowing stable learning.
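
The capacity claim can be pictured with a toy sketch. This is not the study's method. The task names, the frequencies, the equal importance values, and the rule "utility = frequency * importance" are all assumptions made for illustration. The sketch only shows how a fixed budget of N slots, filled in order of usefulness, leaves out a task at 0.25 percent of the data until N is large enough.

```python
def learned_tasks(tasks, capacity):
    """Return the names of the `capacity` tasks with the highest utility.

    `tasks` maps a task name to (frequency, importance). Utility is assumed
    to be frequency * importance; this is an illustrative choice, not a
    formula taken from the study.
    """
    ranked = sorted(
        tasks,
        key=lambda name: tasks[name][0] * tasks[name][1],
        reverse=True,
    )
    return set(ranked[:capacity])


# Hypothetical task mix: frequencies sum to 1, importance is equal for all.
TASKS = {
    "common_a": (0.6000, 1.0),
    "common_b": (0.3000, 1.0),
    "common_c": (0.0975, 1.0),
    "rare": (0.0025, 1.0),  # 0.25 percent of the training data
}

small = learned_tasks(TASKS, capacity=2)  # stand-in for a small model
large = learned_tasks(TASKS, capacity=4)  # stand-in for a large model

print("rare" in small, "rare" in large)  # prints: False True
```

In this toy setup the rare task is covered only once the budget exceeds the number of more frequent tasks, which mirrors the summary's point that capacity reaches rare tasks after frequent ones are handled.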
Summaries by ByteBrief