
Loss of control

Risk Domain

AI systems acting in conflict with human goals or values — especially the goals of designers or users — or with ethical standards. These misaligned behaviours may be introduced by humans during design and development (for example, through reward hacking or goal misgeneralisation), or may arise when AI systems use dangerous capabilities such as manipulation, deception, or situational awareness to seek power, self-proliferate, or achieve other goals.

"‘Loss of control’ scenarios are hypothetical future scenarios in which one or more general-purpose AI systems come to operate outside of anyone’s control, with no clear path to regaining control. These scenarios vary in their severity, but some experts give credence to outcomes as severe as the marginalisation or extinction of humanity." (p. 100)

Supporting Evidence (2)

1. "Two key requirements for commonly discussed loss of control scenarios are a. markedly increased AI capabilities and b. the use of those capabilities in ways that undermine control. First, some future AI systems would need specific capabilities (significantly surpassing those of current systems) that allow them to undermine human control. Second, some AI systems would need to employ these ‘control-undermining capabilities’, either because they were intentionally designed to do so or because technical issues produce unintended behaviour." (p. 100)
2. "There are multiple versions of loss of control concerns, including versions that emphasise ‘passive’ loss of control (see Figure 2.5). In ‘passive’ loss of control scenarios, important decisions are delegated to AI systems, but the systems’ decisions are too opaque, complex, or fast to allow for or incentivise meaningful oversight. Alternatively, people stop exercising oversight because they strongly trust the systems’ decisions and are not required to exercise oversight (585, 589). These concerns are partly grounded in the ‘automation bias’ literature, which reports many cases of people complacently relying on recommendations from automated systems (590, 591)." (p. 101)
