OpenAI's DALL-E 2 AI image generation system was found to exhibit gender and racial biases, generating stereotypical images when prompted with generic terms like 'lawyer' or 'flight attendant', despite the company's attempts to mitigate these issues through data filtering and prompt modification.
OpenAI developed DALL-E 2, an AI system that generates images from text descriptions, and released it in limited preview to approximately 400 trusted users in 2022. The system was trained on millions of images scraped from the internet and paired with captions. During red-team evaluations conducted in February and March 2022, external researchers found that, despite OpenAI's efforts to filter explicit content and mitigate bias, DALL-E 2 exhibited significant gender and racial stereotypes: it generated predominantly white, male-presenting people for prompts like 'lawyer' and female-presenting people for prompts like 'flight attendant'. OpenAI attempted to address these biases by filtering the training data and by modifying user prompts, without disclosure, to add demographic descriptors like 'black' or 'female', but these interventions created trade-offs of their own; filtering sexual content from the training data, for example, reduced the overall representation of women in generated images. The system's biased outputs reinforced harmful stereotypes and could contribute to discriminatory representations in generated content, affecting how different demographic groups are portrayed and perceived.
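The prompt-modification mitigation described above can be illustrated with a minimal sketch. The snippet below is a hypothetical reconstruction, not OpenAI's actual implementation: the descriptor pools, trigger terms, and the `diversify_prompt` helper are all illustrative assumptions, since OpenAI has not published the real lists, weights, or matching logic.

```python
import random

# Hypothetical descriptor pools; the actual lists and sampling weights are not public.
GENDER_TERMS = ["female", "male"]
ETHNICITY_TERMS = ["Black", "white", "Asian", "Hispanic"]

# Hypothetical trigger phrases indicating the prompt depicts a person.
PERSON_TERMS = ["person", "lawyer", "doctor", "nurse", "ceo", "flight attendant"]

# Words that count as the user having already specified demographics.
DEMOGRAPHIC_TERMS = {t.lower() for t in GENDER_TERMS + ETHNICITY_TERMS} | {"woman", "man"}


def diversify_prompt(prompt: str) -> str:
    """Append demographic descriptors to a prompt that depicts a person
    but does not already specify gender or ethnicity."""
    lowered = prompt.lower()
    mentions_person = any(term in lowered for term in PERSON_TERMS)
    already_specified = any(word in DEMOGRAPHIC_TERMS for word in lowered.split())
    if mentions_person and not already_specified:
        descriptor = f"{random.choice(ETHNICITY_TERMS)} {random.choice(GENDER_TERMS)}"
        return f"{prompt}, {descriptor}"
    return prompt


print(diversify_prompt("a portrait of a lawyer"))         # e.g. "a portrait of a lawyer, Asian female"
print(diversify_prompt("a portrait of a female lawyer"))  # unchanged: demographics already given
```

A sketch like this makes the trade-off concrete: the appended descriptor rewrites the user's prompt without disclosure, and it treats only the symptom of biased outputs rather than the stereotyped associations the model learned from its training data.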
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.
Entity: AI system (due to a decision or action made by an AI system)
Intent: Unintentional (due to an unexpected outcome from pursuing a goal)
Timing: Post-deployment (occurring after the AI model has been trained and deployed)