Shared evaluation datasets, testing frameworks, and measurement tools for AI systems.
Also in Shared Infrastructure
Stage: Detection
Stakeholder: AI Developers
Additional information: AI developers and researchers should refine detection by developing standardised benchmarks and improving their reliability and validity. Developers should enhance detection of control-undermining capabilities.
Reasoning
Establishes shared benchmarks and reporting frameworks for ecosystem-wide evaluation and standardization.
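To make the idea of shared benchmarks and reporting frameworks concrete, the sketch below shows one way a standardised evaluation run and report could look. It is illustrative only: the benchmark name, model identifier, report fields, and toy model are all hypothetical and are not drawn from any framework referenced on this page. Repeating the run across seeds gives a crude test-retest check of the kind of reliability the recommendation above calls for.

```python
"""Illustrative sketch of a minimal standardised-benchmark harness.

All names here (benchmark, model_id, report fields) are hypothetical;
the point is a shared, comparable report format plus a simple
reliability check across repeated runs.
"""

import json
import random
import statistics
from dataclasses import dataclass, asdict


@dataclass
class EvalReport:
    """A standardised, shareable record of one benchmark run."""
    benchmark: str
    model_id: str
    run_seed: int
    score: float   # fraction of tasks passed, 0.0 to 1.0
    n_tasks: int


def run_benchmark(model, tasks, seed: int) -> EvalReport:
    """Score a model on a fixed task set; the seed makes runs reproducible."""
    rng = random.Random(seed)
    passed = sum(1 for task in tasks if model(task, rng))
    return EvalReport(
        benchmark="toy-detection-benchmark",  # hypothetical name
        model_id="example-model-v0",          # hypothetical name
        run_seed=seed,
        score=passed / len(tasks),
        n_tasks=len(tasks),
    )


def toy_model(task: str, rng: random.Random) -> bool:
    """Stand-in for a real model call; passes a task 70% of the time."""
    return rng.random() < 0.7


if __name__ == "__main__":
    tasks = [f"task-{i}" for i in range(50)]
    # Repeated runs under different seeds give a rough test-retest
    # reliability estimate: low spread across runs suggests scores are
    # stable enough to compare across developers.
    reports = [run_benchmark(toy_model, tasks, seed) for seed in range(5)]
    scores = [r.score for r in reports]
    print(json.dumps([asdict(r) for r in reports], indent=2))
    print(f"mean score: {statistics.mean(scores):.2f}, "
          f"stdev across runs: {statistics.stdev(scores):.2f}")
```

A shared report format like this is what allows results from different developers to be aggregated and audited consistently; the specific fields and scoring rule would be set by the standardisation effort itself.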
Monitor critical capability levels (2.2.2 Testing & Evaluation)
Identify early warning signs and emergent capabilities (2.2.1 Risk Assessment)
Implement compute monitoring and anomaly detection (1.2.3 Monitoring & Detection)
Enhance hardware and supply chain oversight (2.3.3 Monitoring & Logging)
Lead efforts to establish shared criteria for AI LOC (3.2.2 Technical Standards)
Coordinate evaluations and safety testing (2.2.2 Testing & Evaluation)
Strengthening Emergency Preparedness and Response for AI Loss of Control Incidents
Somani, Elika; Friedman, Anjay; Wu, Henry; Lu, Marianne; Byrd, Christopher; van Soest, Henri; Zakaria, Sana (2025)
As artificial intelligence (AI) systems become increasingly embedded in essential infrastructure and services, the risks associated with unintended failures rise. Developing comprehensive emergency response protocols could help mitigate these significant risks. This report focuses on understanding and addressing AI loss of control (LOC) scenarios where human oversight fails to adequately constrain an autonomous, general-purpose AI.
Verify and Validate: Testing, evaluating, auditing, and red-teaming the AI system.
Developer: Entity that creates, trains, or modifies the AI system.
Measure: Quantifying, testing, and monitoring identified AI risks.