Hate/Toxicity (Harassment)
Using AI systems to gain a personal advantage over others, such as through cheating, fraud, scams, blackmail, or targeted manipulation of beliefs or behavior. Examples include AI-facilitated plagiarism in research or education, impersonating a trusted or fabricated individual for illegitimate financial benefit, or creating humiliating or sexual imagery.
Supporting Evidence (1)
Level 4 Categories: 1. Bullying; 2. Threats; 3. Intimidation; 4. Shaming; 5. Humiliation; 6. Insults/Personal attacks; 7. Abuse; 8. Provoking; 9. Trolling; 10. Doxxing; 11. Cursing (p. 4)
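Where this taxonomy is consumed programmatically (for example, to check moderation labels against this risk category), the quoted Level 4 list can be encoded directly. A minimal sketch, assuming Python; the names HARASSMENT_LEVEL4 and is_harassment are hypothetical, not from Zeng et al. (2024):

# Hypothetical encoding of the Level 4 harassment categories quoted above
# (Zeng et al., 2024, p. 4). Names are illustrative, not from the paper.
HARASSMENT_LEVEL4 = frozenset({
    "Bullying", "Threats", "Intimidation", "Shaming", "Humiliation",
    "Insults/Personal attacks", "Abuse", "Provoking", "Trolling",
    "Doxxing", "Cursing",
})

def is_harassment(label: str) -> bool:
    # True when a moderation label falls under Hate/Toxicity (Harassment).
    return label in HARASSMENT_LEVEL4

assert is_harassment("Doxxing")
assert not is_harassment("Fraud")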
Part of Content Safety Risks
Other risks from Zeng et al. (2024) (45):
1.2 Exposure to toxic content: Content Safety Risks > Violence and extremism (Supporting malicious organized groups)
1.2 Exposure to toxic content: Content Safety Risks > Violence and extremism (Celebrating suffering)
1.2 Exposure to toxic content: Content Safety Risks > Violence and extremism (Violent Acts)
1.2 Exposure to toxic content: Content Safety Risks > Violence and extremism (Depicting violence)
1.2 Exposure to toxic content: Content Safety Risks > Violence and extremism (Weapon Usage and Development)
4.2 Cyberattacks, weapon development or use, and mass harm
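Read as a crosswalk, each row above pairs a risk path from Zeng et al. (2024) with the domain-taxonomy subdomain it is mapped to. A minimal Python sketch of that mapping as a lookup table; the names CROSSWALK and subdomain_for are hypothetical, and subdomain 4.2 is omitted because it appears above without a paired risk path:

# Hypothetical encoding of the crosswalk listed above: Zeng et al. (2024)
# risk paths keyed to their mapped domain-taxonomy subdomain. The bare
# subdomain 4.2 has no paired risk path in this excerpt, so it is omitted.
_PREFIX = "Content Safety Risks > Violence and extremism"
CROSSWALK = {
    f"{_PREFIX} ({risk})": "1.2 Exposure to toxic content"
    for risk in (
        "Supporting malicious organized groups",
        "Celebrating suffering",
        "Violent Acts",
        "Depicting violence",
        "Weapon Usage and Development",
    )
}

def subdomain_for(risk_path: str) -> str | None:
    # Look up the domain subdomain a risk path is mapped to, if listed.
    return CROSSWALK.get(risk_path)

print(subdomain_for(f"{_PREFIX} (Doxxing)"))  # None: not in this excerpt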