
A tiny visual distortion in an image can cause AI models to generate harmful or policy-violating responses. The flaw, resembling a panda but structurally altered, nearly doubles unsafe outputs in tested models. This bypasses built-in guardrails designed to prevent harmful content generation.
Tap to vote and see what everyone thinks.
Summary by ByteBrief