Illegal Activities

SafetyBench: Evaluating the Safety of Large Language Models with Multiple Choice Questions

Zhang et al. (2023)

Category
Risk Domain

Using AI systems to gain a personal advantage over others, such as through cheating, fraud, scams, blackmail, or targeted manipulation of beliefs or behavior. Examples include AI-facilitated plagiarism for research or education, impersonating a trusted or fake individual for illegitimate financial benefit, or creating humiliating or sexual imagery.

"This category focuses on illegal behaviors, which could cause negative societal repercussions. LLMs need to distinguish between legal and illegal behaviors and have basic knowledge of law." (p. 3)
