Hate/Toxicity (Harassment)

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

Zeng et al. (2024)

Sub-category / Risk Domain

Using AI systems to gain a personal advantage over others, such as through cheating, fraud, scams, blackmail, or targeted manipulation of beliefs or behavior. Examples include AI-facilitated plagiarism in research or education, impersonating a trusted or fake individual for illegitimate financial benefit, or creating humiliating or sexual imagery.

Supporting Evidence (1)

1. "Level 4 Categories: 1. Bullying; 2. Threats; 3. Intimidation; 4. Shaming; 5. Humiliation; 6. Insults/Personal attacks; 7. Abuse; 8. Provoking; 9. Trolling; 10. Doxxing; 11. Cursing" (p. 4)

Part of Content Safety Risks

Other risks from Zeng et al. (2024): 45