Meta's AI content moderation system incorrectly flagged legitimate news posts from international publishers as spam, causing widespread removal of standard journalism content across multiple countries.
Starting around May 25, 2024, Meta's automated content moderation system began incorrectly flagging legitimate news posts from independent publishers as spam and removing them, affecting outlets across the US, Europe, and the UK. Publishers in Pennsylvania, Poland, the Czech Republic, Portugal, Slovenia, and the UK saw posts linking to standard local news articles about politics, elections, crime, businesses, and community events marked as violating Facebook's Community Standards on spam. Some publishers reported more than 20 posts removed, with notifications claiming they 'tried to get likes, follows, shares or video views in a misleading way.' Most affected publishers were small, independent local news outlets heavily reliant on Facebook for traffic, with some receiving 30-50% of their website traffic from social media. While some posts were restored after review, many publishers struggled to reach Meta for resolution, and some received outdated notifications citing Covid-19 staffing issues. The timing was particularly problematic because it coincided with the run-up to the UK general election, raising concerns about potential impacts on balanced political coverage.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or domains requiring moral reasoning.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed