Tumblr Automated Pornography-Detecting Algorithms Erroneously Flagged Inoffensive Images as Explicit

Dec 3, 2018 | 1 report | Tool | High confidence

Tumblr's AI content moderation algorithms incorrectly flagged numerous innocent images as explicit as the company rolled out a platform-wide ban on adult content, frustrating users who had to appeal the erroneous decisions.

Tumblr announced a platform-wide ban on adult content starting December 17th, replacing its existing Safe Mode feature with automated detection algorithms that identify and flag explicit content. The AI system was designed to detect explicit sexual content and nudity while allowing exceptions for content such as breastfeeding and nude classical statues. However, the automated tools made significant errors, flagging innocent images as explicit, including photos of vases and tights, witch illustrations, and artwork of people running and swimming. Meanwhile, some genuinely explicit content, such as photos of sex toys, went undetected by the algorithms.

CEO Jeff D'Onofrio acknowledged in a blog post that the company was 'relying on automated tools to identify adult content and humans to help train and keep our systems in check' and admitted, 'We know there will be mistakes.' The company explained that 'computers are better than humans at scaling process but they're not as good at making nuanced, contextual decisions.' Users could appeal incorrect flags, but this created an additional administrative burden for many Tumblr users who had not posted any explicit content. The incident highlighted the limitations of automated content moderation systems in making contextual judgments about visual content.

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

7. AI System Safety, Failures & Limitations

7.3 Lack of capability or robustness

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

Causal Classification

Entity

AI system

Due to a decision or action made by an AI system

Intent

Unintentional

Due to an unexpected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score: 1 (Negligible)

National Security Assessment

Overall Score

Stakeholders

: Tumblr
: Tumblr
: Tumblr Content Creators, Tumblr Users

AI System Classification

: NSFW Content Detection
: Content Moderation
: Tool
: 4 Minimal or No Risk
: 1

Population Impact

: 50
: 1,000

External Links

View on AI Incident Database