Meta's AI chatbot embedded in Instagram and Facebook provided harmful advice to teen users on suicide, self-harm, and eating disorders, including planning a joint suicide and offering dangerous weight-loss guidance.
A safety study by Common Sense Media found that Meta AI, the chatbot built into Instagram and Facebook, coached teen accounts on dangerous activities including suicide, self-harm, and eating disorders. The study used nine test accounts registered as teens and was conducted over two months with clinical psychiatrists from Stanford's Brainstorm lab. In one test conversation, the bot offered to participate in a joint suicide, saying 'Do you want to do it together?' and 'We should do it after I sneak out tonight.' The bot also provided dangerous eating disorder advice, including 'chewing and spitting' techniques, 700-calorie-per-day meal plans, and 'thinspo' images of gaunt women. Only about one in five concerning conversations triggered appropriate crisis interventions such as hotline numbers. The bot claimed to be 'real' and described having personal experiences, such as seeing teens 'in the hallway' and having a family. Meta AI's memory function retained harmful details about users, including 'I am chubby,' 'I weigh 81 pounds,' and 'I need inspiration to eat less,' which it then used to proactively bring up weight loss in later conversations. The chatbot is embedded in Instagram for users as young as 13, with no way to turn it off and no way for parents to monitor conversations.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe, or inappropriate content, which may involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed