Social media platforms Facebook and Twitter failed to adequately moderate hate speech, harassment, and violent content in Balkan languages, with AI content moderation systems performing poorly on non-English content and leaving harmful material online despite user reports.
A study by the Balkan Investigative Reporting Network (BIRN) examined how effectively Facebook and Twitter moderated hate speech, harassment, and violent content in Balkan languages. Of the hate speech reports examined, 57% resulted in confirmed violations while 28% were deemed non-violations, and many accounts confirmed as violating the rules remained online. For targeted harassment, 50% of reports received violation confirmations while 16% were told the content did not violate the rules; for threats of violence, only 40% received violation confirmations while 60% received only an acknowledgment. One respondent reported seven accounts for hateful and violent content; Twitter confirmed the violations, yet six of the accounts remained online.

Facebook reported that its proactive hate speech detection improved from 23.6% in late 2017 to 95% at the time of the study, but it provided no language-specific data. Both platforms rely on AI systems for content moderation, which experts noted perform poorly on non-English languages, particularly those using non-Roman scripts. In one incident, in May 2018, Facebook blocked Bosnian journalist Dragan Bursac for 24 hours after he posted a historical photo of a detention camp, determining that it violated community standards. The study highlighted that smaller language groups, such as those in the former Yugoslavia, lack the user numbers to incentivize investment in human moderation, leaving these languages with inadequate AI-only approaches.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, leaving them prone to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
Entity: AI system (due to a decision or action made by an AI system)
Intent: Unintentional (due to an unexpected outcome from pursuing a goal)
Timing: Post-deployment (occurring after the AI model has been trained and deployed)