Facebook's advertising algorithm automatically created anti-Semitic ad targeting categories such as 'Jew hater' and 'How to burn jews,' allowing advertisers to direct ads at users who had listed such interests in their profiles.
ProPublica discovered that Facebook's self-service advertising platform had automatically generated anti-Semitic targeting categories, including 'Jew hater,' 'How to burn jews,' and 'History of why jews ruin the world,' based on what users had entered in their profile fields. The anti-Semitic categories together contained approximately 2,300 people. To test the system, ProPublica paid $30 to create three promoted posts targeting these groups, which Facebook approved within 15 minutes. Facebook's algorithm had created the categories automatically from user-entered profile information in fields such as 'field of study' or 'employer,' without human review. When ProPublica searched for 'jew h,' the system auto-suggested additional offensive categories, and when it searched for 'Hitler,' the system suggested 'Hitler did nothing wrong.' Because the hateful categories were small, targeting them required combining them with larger groups to meet Facebook's minimum audience size requirements. After ProPublica contacted Facebook about the issue, the company removed the anti-Semitic categories and said it would build new safeguards to prevent similar problems. Facebook spokesman Rob Leathern acknowledged that the categories violated the company's standards and said new review processes were being implemented.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe, or inappropriate content, which may include providing advice on or encouraging harmful actions. Examples of toxic content include hate speech, violence, extremism, illegal acts, and child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed