AI-Generated Reading Summaries on Fable App Reportedly Wrote Biased and Offensive Commentary

Dec 29, 2024 | 4 reports | Severity: Minor | Tool | High confidence

Fable's AI-powered reading summary feature generated offensive content suggesting Black readers should read more white authors and making inappropriate comments about users' race, sexuality, and disability, prompting the company to remove all AI features.

Fable, a social media app for book lovers with approximately 2 million users, deployed an AI-powered year-end reading summary feature in late 2024 using OpenAI's API. The feature was designed to create personalized summaries of users' reading habits in a playful, roasting tone. Instead, it generated offensive comments about users' race, sexuality, and disability status. Notable examples included telling a Black reader who focused on Black narratives to 'surface for the occasional white author' and asking a reader of diverse books if they were 'ever in the mood for a straight, cis white man's perspective.' The summaries came to light when users shared screenshots on social media platforms such as Threads and Instagram. Fable's head of product, Chris Gallello, acknowledged the issue, calling the output 'very bigoted racist language' that was 'shocking' to the team. The company initially attempted to modify the AI model by removing the roasting component and adding safeguards, but ultimately decided to remove all AI-powered features from the platform. Multiple users, including the affected individuals, deleted their accounts in response to the incident.

Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.

Risk Domain

1 Discrimination & Toxicity

1.2 Exposure to toxic content

AI that exposes users to harmful, abusive, unsafe or inappropriate content. May involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.

Causal Classification

Entity

AI system

Due to a decision or action made by an AI system

Intent

Unintentional

Due to an unexpected outcome from pursuing a goal

Timing

Post-deployment

Occurring after the AI model has been trained and deployed

Harm Severity Assessment

Highest Score: 2: Minor (Toxic or Malicious Content, direct)

National Security Assessment

Overall Score

Stakeholders

: OpenAI, Fable
: Fable
: Fable Users, Fable

AI System Classification

: Content Generation
: Activity Tracking
: Tool
: 3 Limited Risk
: 1

Population Impact

: 4
: 2,000,000

External Links

View on AI Incident Database