Discrimination, toxicity, and bias
Risk Domain
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.
"AI models and the tools that use them may exacerbate unequal access to employment and services. AI-generated content can promote inequality and harmful stereotypes." (p. 18)
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Other risks from Leech et al. (2024) (13)
Harm caused by incompetent systems
7.3 Lack of capability or robustness (Entity: AI system; Intent: Unintentional; Timing: Post-deployment)
Harm caused by unaligned competent systems
7.1 AI pursuing its own goals in conflict with human goals or values (Entity: Other; Intent: Other; Timing: Other)
Harm caused by unaligned competent systems > Specification gaming
7.1 AI pursuing its own goals in conflict with human goals or values (Entity: AI system; Intent: Intentional; Timing: Other)
Harm caused by unaligned competent systems > Emergent goals
7.1 AI pursuing its own goals in conflict with human goals or values (Entity: AI system; Intent: Intentional; Timing: Other)
Harm caused by unaligned competent systems > Deceptive alignment
7.2 AI possessing dangerous capabilities (Entity: AI system; Intent: Intentional; Timing: Pre-deployment)
Within-country issues: domestic inequality
6.1 Power centralization and unfair distribution of benefits (Entity: Other; Intent: Other; Timing: Other)