Facebook's AI-Supported Moderation Failed to Classify Terrorist Content in East African Languages

Jun 1, 2015 | 2 reports | Severity: Severe | Tool | High confidence

Facebook's AI content moderation systems failed to detect and remove terrorist content posted by al-Shabaab and Islamic State supporters in Somali, Kiswahili, and Arabic, allowing extremist propaganda to spread widely across the platform.

A two-year investigation by the Institute for Strategic Dialogue found that Facebook's content moderation systems systematically failed to detect terrorist content from al-Shabaab and Islamic State supporters posting in non-English languages including Somali, Kiswahili, and Arabic. The researchers identified 445 public profiles sharing extremist content and tagging over 17,000 other accounts across Facebook's platform. One Somali-language media outlet shared four official al-Shabaab videos through its public page during a three-week period in October 2021, garnering 53,300 views and 17,800 shares while remaining undetected for months. The content included official al-Shabaab media branding, calls for violence, accusations that Kenyan government officials were enemies of Muslims, and praise for killing Kenyan soldiers. Facebook's AI moderation systems and human moderators both failed to identify this content despite it being posted openly rather than in private groups. The platform's language moderation gaps have been documented previously in internal Facebook documents, which showed the company lagged in moderating languages in at-risk countries. Even when pages were taken down, they were quickly reconstituted under different names, indicating systematic evasion of detection systems.

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

7. AI System Safety, Failures & Limitations

7.3 Lack of capability or robustness

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

Causal Classification

Entity

AI system

Due to a decision or action made by an AI system

Intent

Unintentional

Due to an unexpected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score: 4: Severe (Toxic or Malicious Content, inferred)

National Security Assessment

Overall Score

Stakeholders

: Facebook
: Facebook
: Facebook Users Speaking East African Languages, Facebook Users In East Africa

AI System Classification

: Content Moderation
: Hate Speech Detection
: Tool
: 2 High Risk
: 1

Population Impact

: 53,300

External Links

View on AI Incident Database