Loss of control
AI systems acting in conflict with human goals or values, especially the goals of designers or users, or with ethical standards. These misaligned behaviours may be introduced by humans during design and development, for example through reward hacking and goal misgeneralisation, or may result from AI systems using dangerous capabilities such as manipulation, deception, or situational awareness to seek power, self-proliferate, or achieve other goals.
"‘Loss of control’ scenarios are hypothetical future scenarios in which one or more general-purpose AI systems come to operate outside of anyone’s control, with no clear path to regaining control. These scenarios vary in their severity, but some experts give credence to outcomes as severe as the marginalisation or extinction of humanity." (p. 100)
Supporting Evidence (2)
"Two key requirements for commonly discussed loss of control scenarios are (a) markedly increased AI capabilities and (b) the use of those capabilities in ways that undermine control. First, some future AI systems would need specific capabilities (significantly surpassing those of current systems) that allow them to undermine human control. Second, some AI systems would need to employ these 'control-undermining capabilities', either because they were intentionally designed to do so or because technical issues produce unintended behaviour." (p. 100)
"There are multiple versions of loss of control concerns, including versions that emphasise ‘passive’ loss of control (see Figure 2.5). In ‘passive’ loss of control scenarios, important decisions are delegated to AI systems, but the systems’ decisions are too opaque, complex, or fast to allow for or incentivise meaningful oversight. Alternatively, people stop exercising oversight because they strongly trust the systems’ decisions and are not required to exercise oversight (585, 589). These concerns are partly grounded in the ‘automation bias’ literature, which reports many cases of people complacently relying on recommendations from automated systems (590, 591)." (p. 101)
Other risks from Bengio2025 (13)
Risks from malicious use → 4.0 Malicious Actors & Misuse
Risks from malicious use > Harm to individuals through fake content → 4.3 Fraud, scams, and targeted manipulation
Risks from malicious use > Manipulation of public opinion → 4.1 Disinformation, surveillance, and influence at scale
Risks from malicious use > Cyber offence → 4.2 Cyberattacks, weapon development or use, and mass harm
Risks from malicious use > Biological and chemical attacks → 4.2 Cyberattacks, weapon development or use, and mass harm
Reliability issues → 7.3 Lack of capability or robustness