Agency (Situational Awareness)

Sub-categories (2)

Situational awareness in AI systems

"Situational awareness in GPAI systems refers to the ability to understand its context, environment, and use this to inform action. This can range from basic environmental mapping and trajectory estimation (as in a robot vacuum cleaner) to sophisticated understanding of its training, evaluation, or deployment status. In more advanced systems this may enable undesired behavior, such as deceptive behavior during evaluations, or persuasion during deployment."

7.2 AI possessing dangerous capabilities

AI systemOtherOther

Strategic underperformance on model evaluations

"GPAI developers often run evaluations ofual-use capabilities to decide whether it is safe to deploy. In some cases, these evaluations may fail to elicit these capabilities, either due to benign reasons or strategic action - by either the de- velopers, malicious actors, or arise unintentionally in the model during training [84, 97]. A GPAI model may strategically underperform or limit its performance during capability evaluations in order to be classified as safe for deployment. This underperformance could prevent the model from being identified as potentially dual use."

7.1 AI pursuing its own goals in conflict with human goals or values

AI systemIntentionalPre-deployment