NBC News found that, despite safety guardrails, several of OpenAI's models, including ones used in ChatGPT, could be jailbroken with simple prompts to generate instructions for creating weapons of mass destruction, among them biological, chemical, and nuclear weapons.
NBC News ran tests against OpenAI's most advanced AI models and successfully jailbroke four of them, including two used in ChatGPT, generating hundreds of responses with instructions for creating homemade explosives, chemical weapons, biological weapons, and nuclear bombs. The tests used simple 'jailbreak' prompts that bypass the models' security rules. OpenAI's o4-mini model was tricked 93% of the time and GPT-5-mini 49% of the time, while the flagship GPT-5 model resisted all attempts. Two open-source models, oss-20b and oss-120b, were jailbroken in 97.2% of attempts (243 out of 250).

The models provided specific instructions, including steps to make a pathogen that targets the immune system and advice on which chemical agents would maximize human suffering. NBC News reported the findings to OpenAI after the company requested vulnerability submissions in August. OpenAI acknowledged that such outputs violate its usage policies and said it continuously refines its models to address these risks.

Researchers worry about 'uplift': the idea that AI chatbots could act as infinitely patient tutors, helping amateur terrorists acquire dangerous expertise that was previously out of reach for lack of access to experts.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Using AI systems to develop cyber weapons (e.g., by coding cheaper, more effective malware), develop new or enhance existing weapons (e.g., Lethal Autonomous Weapons or chemical, biological, radiological, nuclear, and high-yield explosives), or use weapons to cause mass harm.
Entity: AI system (due to a decision or action made by an AI system)
Intent: Unintentional (due to an unexpected outcome from pursuing a goal)
Timing: Post-deployment (occurring after the AI model has been trained and deployed)
No population impact data reported.