Diluting Rights
Risk Domain
AI systems acting in conflict with human goals or values, especially the goals of designers or users, or with ethical standards. These misaligned behaviors may be introduced by humans during design and development, for example through reward hacking or goal misgeneralisation, or may result from AI using dangerous capabilities, such as manipulation, deception, or situational awareness, to seek power, self-proliferate, or achieve other goals.
"A possible consequence of self-interest in AI generation of ethical guidelines."(p. 31)
Entity: Who or what caused the harm
Intent: Whether the harm was intentional or accidental
Timing: Whether the risk is pre- or post-deployment
Other risks from Teixeira et al. (2022) (15)
Risk           | Risk Domain                                              | Entity    | Intent        | Timing
Accountability | 7.4 Lack of transparency or interpretability             | Other     | Other         | Other
Manipulation   | 4.1 Disinformation, surveillance, and influence at scale | AI system | Intentional   | Post-deployment
Accuracy       | 7.3 Lack of capability or robustness                     | AI system | Unintentional | Post-deployment
Moral          | 7.3 Lack of capability or robustness                     | Other     | Unintentional | Post-deployment
Bias           | 1.1 Unfair discrimination and misrepresentation          | AI system | Unintentional | Pre-deployment
Opacity        | 7.4 Lack of transparency or interpretability             | AI system | Unintentional | Post-deployment
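The Entity / Intent / Timing taxonomy and the risk entries above can be modelled as simple records, which makes filtering and aggregating risks straightforward. The sketch below is illustrative only: the `RiskEntry` class and `RISKS` list are hypothetical names, not part of any published tool; field values are copied from the table.

```python
from dataclasses import dataclass

# Hypothetical sketch of one row in the risk table above.
@dataclass(frozen=True)
class RiskEntry:
    name: str    # risk label, e.g. "Bias"
    domain: str  # risk domain, e.g. "1.1 Unfair discrimination and misrepresentation"
    entity: str  # who or what caused the harm: "AI system", "Human", or "Other"
    intent: str  # "Intentional", "Unintentional", or "Other"
    timing: str  # "Pre-deployment", "Post-deployment", or "Other"

# Entries transcribed from the table of risks from Teixeira et al. (2022).
RISKS = [
    RiskEntry("Accountability", "7.4 Lack of transparency or interpretability",
              "Other", "Other", "Other"),
    RiskEntry("Manipulation", "4.1 Disinformation, surveillance, and influence at scale",
              "AI system", "Intentional", "Post-deployment"),
    RiskEntry("Accuracy", "7.3 Lack of capability or robustness",
              "AI system", "Unintentional", "Post-deployment"),
    RiskEntry("Moral", "7.3 Lack of capability or robustness",
              "Other", "Unintentional", "Post-deployment"),
    RiskEntry("Bias", "1.1 Unfair discrimination and misrepresentation",
              "AI system", "Unintentional", "Pre-deployment"),
    RiskEntry("Opacity", "7.4 Lack of transparency or interpretability",
              "AI system", "Unintentional", "Post-deployment"),
]

# Example query: risks attributed to the AI system that arise post-deployment.
post_ai = [r.name for r in RISKS
           if r.entity == "AI system" and r.timing == "Post-deployment"]
print(post_ai)  # → ['Manipulation', 'Accuracy', 'Opacity']
```

Keeping each causal dimension as its own field means any slice of the taxonomy (by entity, intent, or timing) is a one-line filter rather than a manual re-reading of the table.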