Facebook's AI content moderation system failed to label 42% of posts containing misinformation already debunked by fact-checkers, allowing false claims about COVID-19 and elections to spread when users made minor modifications such as changing fonts or cropping images.
Between October 2019 and August 2020, Facebook's AI-powered content moderation system failed to consistently identify and label misinformation that had already been debunked by independent fact-checkers. Avaaz analyzed 1,776 posts containing false or misleading claims and found that 738 (42%) carried no warning label despite containing debunked information. The unlabeled posts were viewed an estimated 142 million times and received 5.6 million reactions.

The failures occurred when users made minor modifications to previously flagged content, such as changing fonts or backgrounds, cropping images, or transcribing image text into plain text. Examples included false claims that Trump had been deliberately infected with COVID-19 and misleading information about mail-in ballot requirements.

Facebook relies on both human reviewers and artificial intelligence to detect policy violations, partnering with over 70 fact-checking organizations and using AI to scale fact-checks to millions of duplicate posts. The company disputed Avaaz's findings, stating that it had taken enforcement action against the majority of the identified pages and groups. Avaaz identified 119 pages that had posted misinformation at least three times, 46 of which shared unlabeled versions of fact-checked false content.
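The failure mode described above, where trivial edits defeat duplicate detection, can be illustrated with a minimal sketch. This is a hypothetical example, not Facebook's actual pipeline: an exact hash match breaks under any modification, while a near-duplicate similarity check still flags the altered post.

```python
import difflib
import hashlib


def exact_match(a: str, b: str) -> bool:
    # Exact-duplicate detection: any edit, however small,
    # produces a completely different hash.
    return hashlib.sha256(a.encode()).hexdigest() == hashlib.sha256(b.encode()).hexdigest()


def fuzzy_match(a: str, b: str, threshold: float = 0.9) -> bool:
    # Near-duplicate detection: normalize case and whitespace,
    # then compare with a similarity ratio so small edits still match.
    def norm(s: str) -> str:
        return " ".join(s.lower().split())

    return difflib.SequenceMatcher(None, norm(a), norm(b)).ratio() >= threshold


# Hypothetical debunked claim and a trivially modified repost of it.
original = "Mail-in ballots require a witness signature in every state."
modified = "MAIL-IN  ballots require a witness signature in every state!"

print(exact_match(original, modified))  # False: the edit evades exact matching
print(fuzzy_match(original, modified))  # True: similarity matching still flags it
```

The gap between the two checks is the incident's core mechanism: a system that only propagates fact-check labels to exact or near-exact copies misses reposts with cosmetic changes.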
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans who make decisions based on false beliefs can experience physical, emotional, or material harms.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed