As Tumblr rolled out a platform-wide ban on adult content, its AI content moderation algorithms incorrectly flagged numerous innocent images as explicit, frustrating users who had to appeal the erroneous decisions.
Tumblr announced a platform-wide ban on adult content effective December 17, 2018, replacing its existing Safe Mode feature with automated detection algorithms that identify and flag explicit content. The system was designed to detect explicit sexual content and nudity while allowing exceptions such as breastfeeding and nude classical statues. In practice, the automated tools erred in both directions: they flagged innocent images, including photos of vases and tights, witch illustrations, and artwork of people running and swimming, while some actual explicit content, such as photos of sex toys, went undetected. CEO Jeff D'Onofrio acknowledged in a blog post that the company was 'relying on automated tools to identify adult content and humans to help train and keep our systems in check' and admitted, 'We know there will be mistakes.' The company explained that 'computers are better than humans at scaling process but they're not as good at making nuanced, contextual decisions.' Users could appeal incorrect flags, but the appeals process placed an administrative burden on many users who had not posted any explicit content. The incident highlighted the limits of automated content moderation systems in making contextual judgments about visual content.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
AI system: Due to a decision or action made by an AI system
Unintentional: Due to an unexpected outcome from pursuing a goal
Post-deployment: Occurring after the AI model has been trained and deployed