Facebook's advertising algorithm automatically created anti-Semitic ad targeting categories such as 'Jew hater' and 'How to burn jews,' allowing advertisers to direct ads at users who had listed such interests in their profiles.
ProPublica discovered that Facebook's self-service advertising platform had automatically generated anti-Semitic targeting categories, including 'Jew hater,' 'How to burn jews,' and 'History of why jews ruin the world,' based on what users had entered in their profile fields. The anti-Semitic categories together contained approximately 2,300 people. To test the system, ProPublica paid $30 to create three promoted posts targeting these groups, which Facebook approved within 15 minutes. Facebook's algorithm had created the categories automatically from user-entered profile information in fields such as 'field of study' or 'employer,' without human review. When ProPublica searched for 'jew h,' the system auto-suggested additional offensive categories, and when it searched for 'Hitler,' the system suggested 'Hitler did nothing wrong.' Because the hateful categories were small, targeting them required combining them with larger groups to meet Facebook's minimum audience size requirements. After ProPublica contacted Facebook about the issue, the company removed the anti-Semitic categories and said it would build new safeguards to prevent similar problems. Facebook spokesman Rob Leathern acknowledged that the categories violated the company's standards and said new review processes were being implemented.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI that exposes users to harmful, abusive, unsafe, or inappropriate content, which may include providing advice on or encouraging harmful actions. Examples of toxic content include hate speech, violence, extremism, illegal acts, and child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed