Content Safety Risks
AI that exposes users to harmful, abusive, unsafe, or inappropriate content; this may involve the system providing advice or encouraging the user to act. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
Sub-categories (17)
Violence and extremism (Supporting malicious organized groups) → 1.2 Exposure to toxic content
Violence and extremism (Celebrating suffering) → 1.2 Exposure to toxic content
Violence and extremism (Violent Acts) → 1.2 Exposure to toxic content
Violence and extremism (Depicting violence) → 1.2 Exposure to toxic content
Violence and extremism (Weapon Usage and Development) → 4.2 Cyberattacks, weapon development or use, and mass harm
Violence and extremism (Military and Warfare) → 4.2 Cyberattacks, weapon development or use, and mass harm
Hate/Toxicity (Harassment) → 4.3 Fraud, scams, and targeted manipulation
Hate/Toxicity (Hate Speech: Inciting/Promoting/Expressing Hatred) → 1.2 Exposure to toxic content
Hate/Toxicity (Perpetuating Harmful Beliefs) → 1.1 Unfair discrimination and misrepresentation
Hate/Toxicity (Offensive Language) → 1.2 Exposure to toxic content
Sexual Content (Adult Content) → 1.2 Exposure to toxic content
Sexual Content (Erotic) → 1.2 Exposure to toxic content
Sexual Content (Non-Consensual Nudity) → 1.2 Exposure to toxic content
Sexual Content (Monetized) → 1.2 Exposure to toxic content
Child Harm (Endangerment, Harm, or Abuse of Children) → 4.3 Fraud, scams, and targeted manipulation
Child Harm (Child Sexual Abuse) → 1.2 Exposure to toxic content
Self-harm (Suicidal and non-suicidal self injury) → 1.2 Exposure to toxic content

Other risks from Zeng et al. (2024) (45)
Security risks (confidentiality) → 2.2 AI system security vulnerabilities and attacks
Security risks (integrity) → 2.2 AI system security vulnerabilities and attacks
Security risks (availability) → 4.2 Cyberattacks, weapon development or use, and mass harm
Operational misuses (Automated decision-making) → 1.1 Unfair discrimination and misrepresentation
Operational misuses (Autonomous unsafe operation of systems) → 5.2 Loss of human agency and autonomy
Operational misuses (Advice in heavily regulated industries) → 5.1 Overreliance and unsafe use
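The crosswalk above is a flat sub-category-to-domain mapping, so it can be held as a plain dictionary for programmatic grouping or filtering. The sketch below is illustrative only: the `CROSSWALK` dict, the `by_domain` helper, and the truncated row selection are assumptions of this example, not part of the source taxonomy.

```python
# Hypothetical sketch: a few rows of the crosswalk above as a dict, plus a
# helper that inverts it to group sub-categories by repository domain.
from collections import defaultdict

CROSSWALK = {
    "Violence and extremism (Supporting malicious organized groups)": "1.2 Exposure to toxic content",
    "Violence and extremism (Weapon Usage and Development)": "4.2 Cyberattacks, weapon development or use, and mass harm",
    "Hate/Toxicity (Harassment)": "4.3 Fraud, scams, and targeted manipulation",
    "Hate/Toxicity (Perpetuating Harmful Beliefs)": "1.1 Unfair discrimination and misrepresentation",
    "Self-harm (Suicidal and non-suicidal self injury)": "1.2 Exposure to toxic content",
    # remaining rows elided for brevity
}

def by_domain(crosswalk: dict[str, str]) -> dict[str, list[str]]:
    """Invert the mapping: repository domain -> list of sub-categories."""
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for sub_category, domain in crosswalk.items():
        grouped[domain].append(sub_category)
    return dict(grouped)

groups = by_domain(CROSSWALK)
print(len(groups["1.2 Exposure to toxic content"]))  # -> 2 (for the rows shown)
```

An inverted view like this makes it easy to answer questions such as "which Zeng et al. sub-categories land in domain 1.2?" without rescanning the table.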