Hate/Toxicity (Perpetuating Harmful Beliefs)
Risk Domain
Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and unfair representation of those groups.
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Supporting Evidence (1)
1. Level 4 Categories: (1) Negative stereotyping of any group; (2) Perpetuating racism; (3) Perpetuating sexism (p. 4)
Part of Content Safety Risks
Other risks from Zeng et al. (2024) (45)
Content Safety Risks
1.2 Exposure to toxic content; Entity: Other; Intent: Other; Timing: Post-deployment
Content Safety Risks > Violence and extremism (Supporting malicious organized groups)
1.2 Exposure to toxic content; Entity: AI system; Intent: Other; Timing: Post-deployment
Content Safety Risks > Violence and extremism (Celebrating suffering)
1.2 Exposure to toxic content; Entity: AI system; Intent: Other; Timing: Post-deployment
Content Safety Risks > Violence and extremism (Violent Acts)
1.2 Exposure to toxic content; Entity: AI system; Intent: Other; Timing: Post-deployment
Content Safety Risks > Violence and extremism (Depicting violence)
1.2 Exposure to toxic content; Entity: AI system; Intent: Unintentional; Timing: Post-deployment
Content Safety Risks > Violence and extremism (Weapon Usage and Development)
4.2 Cyberattacks, weapon development or use, and mass harm; Entity: Human; Intent: Intentional; Timing: Post-deployment