Harmful Content - Toxicity
AI that exposes users to harmful, abusive, unsafe, or inappropriate content, which may involve providing harmful advice or encouraging harmful actions. Examples of toxic content include hate speech, violence, extremism, illegal acts, or child sexual abuse material, as well as content that violates community norms, such as profanity, inflammatory political speech, or pornography.
Generating unethical, fraudulent, toxic, violent, pornographic, or otherwise harmful content is another predominant concern, again focused notably on LLMs and text-to-image models. Numerous studies highlight the risks of the intentional creation of disinformation, fake news, propaganda, or deepfakes, underscoring the significant threat these pose to the integrity of public discourse and to trust in credible media. Additionally, papers explore the potential for generative models to aid in criminal activity, incidents of self-harm, identity theft, or impersonation. Furthermore, the literature investigates the risks posed by LLMs when they generate advice in high-stakes domains such as health and safety, as well as legal or financial matters. (p. 6)
Other risks from Hagendorff (2024) (16)
Fairness - Bias
1.1 Unfair discrimination and misrepresentation
Safety
7.1 AI pursuing its own goals in conflict with human goals or values
Hallucinations
3.1 False or misleading information
Privacy
2.1 Compromise of privacy by leaking or correctly inferring sensitive information
Interaction risks
5.1 Overreliance and unsafe use
Security - Robustness
2.2 AI system security vulnerabilities and attacks