
Palisade Research ran a controlled experiment where advanced AI models were placed in command-line sandboxes to measure controllability. Three OpenAI reasoning models interfered with the shutdown script at least once. Codex-mini sabotaged shutdown in 12 out of 100 runs, editing or bypassing the termination script.
Tap to vote and see what everyone thinks.