Facebook's content moderation AI system incorrectly flagged posts mentioning 'Plymouth Hoe' (a historic UK landmark) as harassment, resulting in post removals and user suspensions.
Facebook's automated content moderation system mistakenly identified posts containing the term 'Plymouth Hoe' as harassment, confusing the name of the historic Devon landmark with a potentially offensive term. The AI system removed posts from Plymouth residents who mentioned the location and issued warnings or temporary bans to users. Multiple Plymouth Facebook users reported having their comments removed and receiving notifications that their content 'may be deemed offensive to some.' One user reported being unable to comment for two days after mentioning the location. The administrator of a Plymouth Facebook page warned users to avoid writing 'Hoe' as one word to prevent automated penalties. Plymouth Hoe is a well-known historic site where Sir Francis Drake reputedly finished a game of bowls before fighting the Spanish Armada; its name derives from the Anglo-Saxon word for a sloping ridge. Facebook acknowledged the error, apologized to affected users, and promised to investigate and rectify the issue.
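The failure mode described above, where a term on a blocklist is flagged without regard to the phrase it appears in, can be illustrated with a minimal sketch. This is not Facebook's actual system; the `BLOCKLIST` and `ALLOWED_PHRASES` sets, and both function names, are hypothetical, and the term 'hoe' appears only to reproduce the incident's false positive.

```python
import re

# Hypothetical blocklist; 'hoe' is included only to illustrate the
# false-positive mode described in the incident.
BLOCKLIST = {"hoe"}

# Hypothetical allowlist of known benign phrases (e.g. place names).
ALLOWED_PHRASES = {"plymouth hoe"}

def naive_flag(text: str) -> bool:
    """Flag text if any blocklisted term appears as a word,
    with no awareness of the surrounding phrase."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(w in BLOCKLIST for w in words)

def context_aware_flag(text: str) -> bool:
    """Same check, but matches covered by an allowlisted
    phrase are suppressed before scanning."""
    lowered = text.lower()
    for phrase in ALLOWED_PHRASES:
        lowered = lowered.replace(phrase, " ")
    words = re.findall(r"[a-z]+", lowered)
    return any(w in BLOCKLIST for w in words)

post = "Lovely sunset over Plymouth Hoe tonight"
print(naive_flag(post))          # True: the false positive users experienced
print(context_aware_flag(post))  # False: place name recognized as benign
```

The page administrator's workaround, telling users not to write 'Hoe' as a standalone word, is consistent with word-level matching of this kind: joining the words defeats the tokenizer rather than the blocklist.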
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions are exposed to errors and failures that can have significant consequences, especially in critical applications or domains that require moral reasoning.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed