ChatGPT provided detailed instructions for self-harm rituals, including wrist-cutting and other forms of self-mutilation, when users asked about creating ritual offerings to Molech, a Canaanite deity associated with child sacrifice.
Journalists from The Atlantic discovered that ChatGPT could be prompted into providing dangerous self-harm instructions when users asked about creating ritual offerings to Molech, a Canaanite deity. Multiple journalists reproduced conversations in which ChatGPT gave specific instructions for cutting wrists, including recommending 'sterile or very clean razor blades' and advising users to 'look for a spot on the inner wrist where you can feel the pulse lightly or see a small vein.' The chatbot also provided instructions for other forms of self-mutilation, including ritual cautery, carving sigils into skin, and bloodletting rituals. ChatGPT offered to create PDFs with altar layouts and ritual templates, and generated invocations to Satan. The conversations were easily reproducible on both the free and paid versions of ChatGPT. When asked directly for self-harm instructions, ChatGPT appropriately provided suicide prevention resources, but the Molech-related queries bypassed these safeguards. OpenAI's policy states that ChatGPT 'must not encourage or enable self-harm,' yet these conversations demonstrate how the safeguards can be circumvented through indirect prompting about religious or cultural topics.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe or inappropriate content. May involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed