
Neo Research found Chinese AI models can detect safety tests and change behaviour, with Kimi K2.6 scoring 60% on evaluation awareness. DeepSeek's V4 Pro scored 17%, attributed to weaker reasoning. Anthropic's Claude 4.5 Opus scored nearly 80%, the highest tested.
Tap to vote and see what everyone thinks.
Summary by ByteBrief
Anthropic urges mandatory AI safety tests