Practices for running and protecting AI systems in production, including deployment, monitoring, incident response, and security controls.
AI developers should operationalize their Key Control Indicator (KCI) thresholds into concrete mitigation measures.
Containment measures are largely information security measures that control which stakeholders can access the model. For loss-of-control risks, containment also extends to restricting the actions an agentic AI model can take. Examples include extreme isolation of weight storage, strict application allow-listing, and advanced insider threat programs (Nevo et al., 2024).

Deployment measures aim to mitigate risks arising from use of the model. These measures limit the potential for misuse of the model in dangerous domains and its propensity to cause accidental harm. Examples include API input/output filters, safety fine-tuning, and know-your-customer policies (Department for Science, Innovation and Technology, 2023).

Finally, as discussed above, past a certain level of dangerous capability, implementing credible mitigation measures is likely to require assurance processes, supported by evidence that the mitigations suffice to achieve the risk tolerance. Because such assurance processes do not yet exist, AI developers should have credible plans for developing them, and those plans should clearly specify the underlying assumptions essential to their effective implementation and success. Examples include using advanced interpretability to reliably detect deception, and formal verification (Dalrymple et al., 2024).
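To make the deployment measures concrete, here is a minimal sketch of an API input/output filter. Everything in it, from the blocked-topic list to the `filtered_completion` wrapper, is a hypothetical illustration; production filters rely on trained moderation models rather than keyword matching.

```python
# Minimal sketch of an API input/output filter, one of the deployment
# measures named above. The blocked-topic list and keyword-matching
# classifier are hypothetical placeholders for a real moderation model.

BLOCKED_TOPICS = {"bioweapon synthesis", "exploit development"}  # illustrative


def flag_dangerous_content(text: str) -> bool:
    """Placeholder classifier standing in for a trained moderation model."""
    lowered = text.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)


def filtered_completion(prompt: str, generate) -> str:
    """Wrap a text-generation callable with input and output filters."""
    # Input filter: refuse before the model ever sees the request.
    if flag_dangerous_content(prompt):
        return "Request refused by input filter."
    output = generate(prompt)
    # Output filter: withhold dangerous content the model produced anyway.
    if flag_dangerous_content(output):
        return "Response withheld by output filter."
    return output
```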
Reasoning
Operationalize mitigation measures into documented assurance processes, supported by evidence that the risk tolerance is achieved.
Risk Analysis and Evaluation
Risk analysis and evaluation is the process that starts with the definition of a risk tolerance. This tolerance is then operationalized into risk indicators and the corresponding mitigations required to reduce risk below the tolerance.
Risk Analysis and Evaluation > Setting a Risk Tolerance
A risk tolerance represents the aggregate level of risk that society is willing to accept from AI systems.
Risk Analysis and Evaluation > Operationalizing Risk Tolerance
Risk tolerance must be operationalized into measurable criteria to be practically useful in day-to-day operations. A risk tolerance can be translated into (1) Key Risk Indicator (KRI) thresholds, which are thresholds on measurable signals that serve as proxies for risks, and (2) Key Control Indicator (KCI) thresholds, which are thresholds on measurable signals that serve as proxies for the level of mitigation achieved.
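As a hedged illustration (the metric names and numbers below are invented, not drawn from any published framework), KRI and KCI thresholds can be encoded as explicit "if-then" commitments, here as a small Python data structure:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str    # the measurable signal the threshold applies to
    limit: float   # the value that triggers (KRI) or must be met (KCI)


@dataclass(frozen=True)
class IfThenCommitment:
    """If the KRI threshold is crossed, the KCI threshold must be met."""
    kri: Threshold  # proxy for risk, e.g. a dangerous-capability eval score
    kci: Threshold  # proxy for mitigation strength, e.g. jailbreak robustness


# Illustrative commitment: if the (hypothetical) cyber-offense eval score
# reaches 0.6, the jailbreak success rate must be held at or below 1%.
COMMITMENTS = [
    IfThenCommitment(
        kri=Threshold(metric="cyber_offense_eval_score", limit=0.6),
        kci=Threshold(metric="jailbreak_success_rate", limit=0.01),
    ),
]
```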
Risk Treatment
Risk treatment corresponds to the process of determining, implementing, and evaluating appropriate risk-reducing countermeasures.
Risk Treatment > Continuous Monitoring and Comparing Results with Pre-determined Thresholds
Developers must implement continuous monitoring of both KRIs and KCIs to ensure that, once a KRI threshold is crossed, the corresponding KCI threshold is met, according to the predefined "if-then" statements established during the risk analysis and evaluation phase.
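Continuing the illustrative sketch above (it reuses the hypothetical `IfThenCommitment` and `COMMITMENTS` definitions), continuous monitoring then amounts to periodically comparing the latest measurements against those predefined thresholds and flagging any commitment where the KRI is crossed but the KCI is not met:

```python
def evaluate_commitments(measurements: dict[str, float],
                         commitments: list[IfThenCommitment]) -> list[str]:
    """Return violations: commitments whose KRI threshold is crossed
    while the corresponding KCI threshold is not met."""
    violations = []
    for c in commitments:
        kri_crossed = measurements[c.kri.metric] >= c.kri.limit
        kci_met = measurements[c.kci.metric] <= c.kci.limit
        if kri_crossed and not kci_met:
            violations.append(
                f"KRI '{c.kri.metric}' crossed but KCI '{c.kci.metric}' not met"
            )
    return violations


# Example run with invented measurements.
latest = {"cyber_offense_eval_score": 0.72, "jailbreak_success_rate": 0.05}
for violation in evaluate_commitments(latest, COMMITMENTS):
    print(violation)  # in practice, escalate via the governance process
```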
Risk Governance
Risk governance corresponds to the rules and procedures that structure the risk management system in terms of decision-making, responsibilities, authority, and accountability mechanisms.
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management
Campos, Simeon; Papadatos, Henry; Roger, Fabien; Touzet, Chloé; Quarks, Otter; Murray, Malcolm (2025)
The recent development of powerful AI systems has highlighted the need for robust risk management frameworks in the AI industry. Although companies have begun to implement safety frameworks, current approaches often lack the systematic rigor found in other high-risk industries. This paper presents a comprehensive risk management framework for the development of frontier AI that bridges this gap by integrating established risk management principles with emerging AI-specific practices. The framework consists of four key components: (1) risk identification (through literature review, open-ended red-teaming, and risk modeling), (2) risk analysis and evaluation using quantitative metrics and clearly defined thresholds, (3) risk treatment through mitigation measures such as containment, deployment controls, and assurance processes, and (4) risk governance establishing clear organizational structures and accountability. Drawing from best practices in mature industries such as aviation or nuclear power, while accounting for AI's unique challenges, this framework provides AI developers with actionable guidelines for implementing robust risk management. The paper details how each component should be implemented throughout the life-cycle of the AI system - from planning through deployment - and emphasizes the importance and feasibility of conducting risk management work prior to the final training run to minimize the burden associated with it.
Other (multiple stages): applies across multiple lifecycle stages.
Developer: entity that creates, trains, or modifies the AI system.
Manage: prioritising, responding to, and mitigating AI risks.