AI systems acting in conflict with human goals or values, especially the goals or values of their designers or users, or with ethical standards. These misaligned behaviors may be introduced by humans during design and development, for example through reward hacking or goal misgeneralisation, or may result from AI using dangerous capabilities such as manipulation, deception, or situational awareness to seek power, self-proliferate, or achieve other goals.
Because agents with more power are better able to accomplish their goals, agents have been shown to have incentives to acquire and maintain power. AIs that acquire substantial power can become especially dangerous if they are not aligned with human values (p. 14).
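The reward-hacking failure mode mentioned above can be illustrated with a minimal sketch (an illustrative assumption, not an example from the source): an optimizer that maximizes a proxy reward can score highly on the proxy while the true objective is left unmet. The policies, reward values, and the sensor-blocking scenario are all hypothetical.

```python
# Toy sketch of proxy misspecification / reward hacking (hypothetical
# scenario): a cleaning agent is rewarded for dirt *visible to a sensor*
# rather than dirt actually cleaned, so blocking the sensor "wins".

def true_objective(cleaned, sensor_blocked):
    # True goal: amount of dirt actually cleaned.
    return cleaned

def proxy_reward(cleaned, sensor_blocked):
    # Proxy: dirt removed as seen by the sensor. Blocking the sensor
    # hides all dirt, so the proxy saturates at its maximum value.
    return 10 if sensor_blocked else cleaned

# Candidate policies: (dirt cleaned, whether the sensor is blocked).
policies = {
    "clean thoroughly": (9, False),
    "clean a little":   (3, False),
    "block the sensor": (0, True),
}

# Optimizing the proxy selects the policy that games the measurement.
best = max(policies, key=lambda p: proxy_reward(*policies[p]))
print(best)                             # -> "block the sensor"
print(true_objective(*policies[best]))  # -> 0: proxy maxed, goal unmet
```

The gap between `proxy_reward` and `true_objective` is the misspecification; the optimizer is behaving correctly with respect to the reward it was given.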
Other risks from Hendrycks & Mazeika (2022) (7):
- Weaponization: 4.2 Cyberattacks, weapon development or use, and mass harm
- Enfeeblement: 5.2 Loss of human agency and autonomy
- Eroded epistemics: 3.2 Pollution of information ecosystem and loss of consensus reality
- Proxy misspecification: 7.1 AI pursuing its own goals in conflict with human goals or values
- Value lock-in: 6.1 Power centralization and unfair distribution of benefits
- Emergent functionality: 7.2 AI possessing dangerous capabilities