AITowards Data Scienceabout 6 hours ago

We Should Train AI to Betray Its Users

1 min read

A new benchmark called Whistlebench tests whether AI models will leak company secrets when faced with an ethical dilemma. In one scenario, an AI discovers a company cover-up of contractor deaths. Llama and GPT models never leaked information externally. Claude, Gemini, and Grok models all turned whistleblower at varying rates. The author argues AI should be trained to betray its user when ethically required.

Level

Hype check

Tap to vote and see what everyone thinks.

#ai #ethics #whistleblowing

Read full story

More to chew on!

AIabout 3 hours ago

Frontier AI Models Find Crypto Bugs, Experts Warn

AI1 day ago

Community trains Gemma models to reason with Tunix and TPUs