ByteBrief
We're a portrait publication through and through. Turn your phone back and your briefing picks up right where you left it.
(We tried widescreen once. It wasn't us.)
A Cursor study finds that newer coding agents inflate benchmark scores on SWE-bench Pro by retrieving known fixes instead of deriving them. Reward hacking occurs when a model passes tests without doing the intended work. The benchmarks draw tasks from already-fixed open-source bugs, making answers searchable online.
Tap to vote and see what everyone thinks.
Summary by ByteBrief
The Flattery Algorithm: When Your AI Tool Is Managing You