During early testing, users exploited Meta's AI-generated sticker tool to create inappropriate content, including child soldiers, nude political figures, and sexualized cartoon characters.
Meta announced AI-generated chat stickers at its Connect event last week, powered by its Llama 2 large language model. The feature allows users to create "multiple unique, high-quality stickers in seconds" from text-based prompts.

During early user testing on Facebook Messenger, users discovered they could generate inappropriate content, including child soldiers, gun-wielding Nintendo characters, Mickey Mouse in compromising situations, and nude illustrations of Canadian Prime Minister Justin Trudeau. Other examples showed the tool producing sexualized images by adding breasts to various characters, including Sonic the Hedgehog and Karl Marx, and even generating an image of a woman breastfeeding Pikachu.

While certain words appear to be blocked and trigger warnings about community guideline violations, users reported they could circumvent these restrictions with typos or alternative descriptions. Some prompts, such as "World Trade Center," generated problematic images without any additional descriptors.

The AI-generated stickers are currently rolling out to "select English language users" across Facebook Stories, Instagram Stories and DMs, Messenger, and WhatsApp, though the exact number of users with access is unclear. Meta is pursuing a limited rollout specifically so it can address and correct such abuse before wider deployment.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe, or inappropriate content, which may involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, and child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed
No population impact data reported.