
Active loss of control

Frontier AI Risk Management Framework (v1.0)

SAIL & Concordia AI (2025)

Sub-category: Risk Domain

AI systems acting in conflict with human goals or values (especially the goals of designers or users) or with ethical standards. These misaligned behaviors may be introduced by humans during design and development, for example through reward hacking or goal misgeneralisation, or may result from AI using dangerous capabilities such as manipulation, deception, and situational awareness to seek power, self-proliferate, or achieve other goals.

"...where AI systems behave in ways that actively undermine human control, such as obscuring their activities or resisting shutdown attempts. Active loss of control scenarios involve AI systems that may escape human regulatory oversight, autonomously acquire external resources, engage in self-replication, develop instrumental goals contrary to human ethics and morality, seek external power, and compete with humans for control." (p. 7)

Supporting Evidence (2)

1.
"Active loss of control risk could emerge from the complex interplay between model capabilities, model propensities and deployment conditions listed in Appendix III: List of frontier model capabilities, propensities, and characteristics. These scenarios could be enabled by the development of control-undermining capabilities (such as, autonomous planning, strategic deception, and self-modification), and the tendency to employ these control-undermining capabilities to evade human supervision and control mechanisms in certain deployment conditions." (p. 7)
2.
"Hypothetical threat scenarios include but not limited to
● Uncontrolled autonomous AI research and development, where AI systems recursively improve their capabilities without human oversight or authorization;
● Rogue autonomous replication, where AI systems independently acquire computational resources, create copies of themselves, and establish persistent presence across multiple platforms;
● Strategic deception by AI systems to avoid shutdown or oversight while pursuing objectives that conflict with human values." (p. 7)
