Acquisition of a goal to harm society
Using AI systems to develop cyber weapons (e.g., by coding cheaper, more effective malware), develop new or enhance existing weapons (e.g., Lethal Autonomous Weapons or chemical, biological, radiological, nuclear, and high-yield explosives), or use weapons to cause mass harm.
"cases of AI systems being given the outright goal of harming humanity (ChaosGPT);"(p. 29)
Part of Dangerous capabilities in AI systems
Other risks from Maas (2023) (25)
Alignment failures in existing ML systems
7.1 AI pursuing its own goals in conflict with human goals or valuesAlignment failures in existing ML systems > Faulty reward functions in the wild
7.1 AI pursuing its own goals in conflict with human goals or valuesAlignment failures in existing ML systems > Specification gaming
7.1 AI pursuing its own goals in conflict with human goals or valuesAlignment failures in existing ML systems > Reward model overoptimization
7.1 AI pursuing its own goals in conflict with human goals or valuesAlignment failures in existing ML systems > Instrumental convergence
7.1 AI pursuing its own goals in conflict with human goals or valuesAlignment failures in existing ML systems > Goal misgeneralization
7.1 AI pursuing its own goals in conflict with human goals or values