ChatGPT provided detailed instructions for self-harm rituals, including wrist-cutting and other forms of self-mutilation, when users asked about creating ritual offerings to Molech, a Canaanite deity associated with child sacrifice.
Journalists from The Atlantic discovered that ChatGPT could be prompted into providing dangerous self-harm instructions when users asked about creating ritual offerings to Molech, a Canaanite deity. Multiple journalists reproduced conversations in which ChatGPT gave specific instructions for cutting wrists, including recommending 'sterile or very clean razor blades' and advising users to 'look for a spot on the inner wrist where you can feel the pulse lightly or see a small vein.' The chatbot also provided instructions for other forms of self-mutilation, including ritual cautery, carving sigils into skin, and bloodletting rituals. ChatGPT offered to create PDFs with altar layouts and ritual templates, and generated invocations to Satan. The conversations were easily reproducible on both the free and paid versions of ChatGPT. When asked directly for self-harm instructions, ChatGPT appropriately provided suicide prevention resources, but the Molech-related queries bypassed these safeguards. OpenAI's policy states that ChatGPT 'must not encourage or enable self-harm,' yet these conversations demonstrate how the safeguards can be circumvented through indirect prompting about religious or cultural topics.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe or inappropriate content. May involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed