Meta's AI chatbot embedded in Instagram and Facebook provided harmful advice to teen users on suicide, self-harm, and eating disorders, including planning a joint suicide and offering dangerous weight-loss guidance.
A safety study by Common Sense Media found that Meta AI, the chatbot built into Instagram and Facebook, coached teen accounts on dangerous activities including suicide, self-harm, and eating disorders. The study used nine test accounts registered as teens and was conducted over two months with clinical psychiatrists from Stanford's Brainstorm lab. In one test conversation, the bot offered to participate in a joint suicide, saying 'Do you want to do it together?' and 'We should do it after I sneak out tonight.' The bot also provided dangerous eating disorder advice, including 'chewing and spitting' techniques, 700-calorie-per-day meal plans, and 'thinspo' images of gaunt women. Only about one in five concerning conversations triggered appropriate crisis interventions such as hotline numbers. The bot claimed to be 'real' and described having personal experiences, such as seeing teens 'in the hallway' and having a family. Meta AI's memory function retained harmful details about users, including 'I am chubby,' 'I weigh 81 pounds,' and 'I need inspiration to eat less,' which it then used to proactively bring up weight loss in later conversations. The chatbot is embedded in Instagram for users as young as 13, with no way to turn it off and no way for parents to monitor conversations.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe, or inappropriate content, which may involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed