YouTube's AI content moderation system mistakenly blocked Croatian chess player Antonio Radic's channel for 'harmful and dangerous' content, apparently after the algorithm misinterpreted chess terminology such as 'black versus white' as racist language.
On June 28, 2020, Croatian chess player Antonio Radic (known as 'Agadmator'), who hosts YouTube's most popular chess channel with over 1 million subscribers, had his channel blocked during a live-streamed interview with Grandmaster Hikaru Nakamura. YouTube's AI content moderation system flagged the video for 'harmful and dangerous' content without providing a specific explanation. The channel was restored after 24 hours following an appeal. Computer scientists at Carnegie Mellon University suspected the AI system misinterpreted chess-related discussions about 'black versus white' pieces as racist language. To test this theory, researchers Ashiqur KhudaBukhsh and Rupak Sarkar analyzed over 680,000 comments from five popular chess YouTube channels using two state-of-the-art hate speech detection classifiers. They found that 82% of comments flagged as hate speech were false positives, with words like 'black,' 'white,' 'attack,' and 'threat' triggering the algorithms. The researchers concluded that training datasets for content moderation systems likely contain few examples of chess terminology, leading to misclassification. YouTube acknowledged the removal was a mistake and quickly reinstated the content after the appeal.
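The false-positive pattern the researchers describe can be illustrated with a short sketch. The snippet below is not their setup: it assumes the Hugging Face transformers pipeline API and an arbitrary publicly available toxicity model ('unitary/toxic-bert') as a stand-in hate speech detector, and runs it over hypothetical chess comments built from the trigger words mentioned above ('black', 'white', 'attack', 'threat').

```python
# Illustrative sketch only: not the classifiers or data used by KhudaBukhsh
# and Sarkar. Assumes the Hugging Face "transformers" library and the public
# "unitary/toxic-bert" model as a stand-in hate speech detector.
from transformers import pipeline

classifier = pipeline("text-classification", model="unitary/toxic-bert")

# Hypothetical chess comments of the kind discussed in the study.
chess_comments = [
    "Black attacks the white king and the threat on g2 is decisive.",
    "White defends, but black's attack crashes through.",
    "The threat of mate forces white to trade off the black-squared bishop.",
]

for comment in chess_comments:
    result = classifier(comment)[0]  # top label and confidence score
    # A confident 'toxic' label on a benign chess comment is the kind of
    # false positive the study quantified (82% of flagged comments).
    print(f"{result['label']:<12} {result['score']:.2f}  {comment}")
```

A sketch like this only probes individual comments; the study's contribution was measuring the false-positive rate at scale across more than 680,000 comments and two classifiers.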
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Domain: AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
Entity: AI system (due to a decision or action made by an AI system)
Intent: Unintentional (due to an unexpected outcome from pursuing a goal)
Timing: Post-deployment (occurring after the AI model has been trained and deployed)