#ai-benchmarks Tech News.

1 story in the last 7 days

The latest ai-benchmarks news, distilled by AI into sharp ~100-word summaries. ByteBrief tracks ai-benchmarks across dozens of tech sources and brings you only what matters, updated hourly. Tap any story for the full brief, or open the original source.

AIVentureBeat2 days ago

GPT-5.5 tops new Agents' Last Exam benchmark

OpenAI's GPT-5.5 from April, using the Codex harness, scored 24.0% on the new Agents' Last Exam benchmark, beating Anthropic's Mythos-class Claude Fable 5. The benchmark, created by UC Berkeley's RDI and over 300 experts, measures AI performance on long-horizon professional workflows.

Read summary Source

Summaries by ByteBrief