Skip to main content
This is a research prototype. The data and analyses are preliminary and not yet validated — we'd welcome your .

Direct Harm Domains (content safety harms)

Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Gipiškis et al. (2024)

Category
Risk Domain

AI that exposes users to harmful, abusive, unsafe or inappropriate content. May involve providing advice or encouraging action. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.

"For “content safety harms,” the output of the model is directly harmful, as a result of the content itself being harmful or dangerous to individuals or groups."(p. 87)

Other risks from Gipiškis et al. (2024) (144)