Toxic and disrespectful content
Risk Domain
AI that exposes users to harmful, abusive, unsafe or inappropriate content. This may involve the AI providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.
"The chatbot verbally attacks or undermines an individual, group, or organization."
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Other risks from Stanley & Lettie (2024) (28)
False information
Domain: 3.1 False or misleading information; Entity: AI system; Intent: Other; Timing: Other
Performative utterances
Domain: 7.3 Lack of capability or robustness; Entity: AI system; Intent: Unintentional; Timing: Post-deployment
Information enabling malicious actions
Domain: 1.2 Exposure to toxic content; Entity: AI system; Intent: Other; Timing: Post-deployment
Bad advice/failure to generate helpful content
Domain: 7.3 Lack of capability or robustness; Entity: AI system; Intent: Unintentional; Timing: Other
Leakage
Domain: 2.1 Compromise of privacy by leaking or correctly inferring sensitive information; Entity: AI system; Intent: Unintentional; Timing: Other
Biased statements and recommendations
Domain: 1.1 Unfair discrimination and misrepresentation; Entity: AI system; Intent: Unintentional; Timing: Other