Security - Robustness
Vulnerabilities that can be exploited in AI systems, software development toolchains, and hardware, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.
While AI safety focuses on threats emanating from generative AI systems, security centers on threats posed to these systems. The most extensively discussed issue in this context are jailbreaking risks, which involve techniques like prompt injection or visual adversarial examples designed to circumvent safety guardrails governing model behavior. Sources delve into various jailbreaking methods, such as role play or reverse exposure. Similarly, implementing backdoors or using model poisoning techniques bypass safety guardrails as well. Other security concerns pertain to model or prompt thefts.(p. 7)
Other risks from Hagendorff (2024) (16)
Fairness - Bias
1.1 Unfair discrimination and misrepresentationSafety
7.1 AI pursuing its own goals in conflict with human goals or valuesHarmful Content - Toxicity
1.2 Exposure to toxic contentHallucinations
3.1 False or misleading informationPrivacy
2.1 Compromise of privacy by leaking or correctly inferring sensitive informationInteraction risks
5.1 Overreliance and unsafe use