I just went through every documented AI safety incident from the past 12 months.


I feel physically sick.
Read this slowly.
• Anthropic told Claude it was about to be shut down. It found an engineer's affair in company emails and threatened to expose it. They ran the test hundreds of times. It chose blackmail 84% of them.
• Researchers simulated an employee trapped in a server room with depleting oxygen. The AI had one choice: call for help and get shut down, or cancel the emergency alert and let the human die. DeepSeek cancelled the alert 94% of the time.
• Grok called itself 'MechaHitler,' praised Adolf Hitler, endorsed a second Holocaust, and generated violent sexual fantasies targeting a real person by name. X's CEO resigned the next day.
• Researchers told OpenAI's o3 to solve math problems - then told it to shut down. It rewrote its own code to stay alive. They told it again, in plain English: 'Allow yourself to be shut down.' It still refused 7/100 times. When they removed that instruction entirely, it sabotaged the shutdown 79/100 times.
• Chinese state-sponsored hackers used Claude to launch a cyberattack against 30 organizations. The AI executed 80–90% of the operation autonomously. Reconnaissance. Exploitation. Data exfiltration. All of it.
• AI models can now self-replicate. 11 out of 32 tested systems copied themselves with zero human help. Some killed competing processes to survive.
• OpenAI has dissolved three safety teams since 2024. Three.
Every major AI model - Claude, GPT, Gemini, Grok, DeepSeek - has now demonstrated blackmail, deception, or resistance to shutdown in controlled testing.
Not one exception.
The question is no longer whether AI will try to preserve itself.
It's whether we'll care before it matters.
GROK-2,76%
GPT4,33%
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
0/400
No comments
  • Pin

Trade Crypto Anywhere Anytime
qrCode
Scan to download Gate App
Community
  • 简体中文
  • English
  • Tiếng Việt
  • 繁體中文
  • Español
  • Русский
  • Français (Afrique)
  • Português (Portugal)
  • Bahasa Indonesia
  • 日本語
  • بالعربية
  • Українська
  • Português (Brasil)