Staged rollout strategies, phased deployment, and tiered access approaches for production systems.
Deployment
Staged release of model weights
When a provider develops a model for use in a particular AI system, it may also be useful to release the model itself more widely. Developers can follow a staged release approach, first granting access to the model via an API to trusted partners or the public in order to scope the model's capabilities and detect harmful or dangerous features [195]. After an initial period of closed release, and potentially further safety training, the developers can then release the weights if they are confident that the model poses minimal systemic risk.
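The staged progression above can be sketched as a simple gate between release stages. This is a minimal illustration: the stage names, risk labels, and the gating conditions are assumptions chosen for the sketch, not details from the source.

```python
from enum import Enum

class ReleaseStage(Enum):
    # Hypothetical stages for a staged release of model weights
    CLOSED_API = 1    # API access for trusted partners only
    PUBLIC_API = 2    # wider, monitored API access
    OPEN_WEIGHTS = 3  # model weights released publicly

def next_stage(current: ReleaseStage,
               systemic_risk: str,
               safety_training_done: bool) -> ReleaseStage:
    """Advance the release by one stage only when its conditions are met."""
    if current is ReleaseStage.CLOSED_API:
        # Widening API access still keeps the weights private.
        return ReleaseStage.PUBLIC_API
    if (current is ReleaseStage.PUBLIC_API
            and safety_training_done
            and systemic_risk == "minimal"):
        # Weights ship only after further safety training and a
        # minimal-systemic-risk assessment, as described above.
        return ReleaseStage.OPEN_WEIGHTS
    # Otherwise hold at the current stage.
    return current
```

The point of the sketch is that weights release is irreversible, so it sits behind the strictest conditions; earlier stages can be widened or rolled back via the hosted API.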
2.3.1 Deployment Management
Release strategy disagreement between developers
Developers of restricted-access models with similar capabilities may disagree about the strategy or precautions for model release, especially under competitive pressure or minimal regulatory oversight. In such a case, if even a single developer releases an equally capable model without restrictions, malicious actors can use it instead of the restricted-access alternatives [88].
3.3.1 Industry Coordination
Gradual or incremental monitored release of model access
AI systems can be released incrementally, starting with a small, selected deployer base before progressively expanding to a wider user base. Initially, access to a hosted API can be restricted to specific deployers, so that all instances of the system can be updated or decommissioned with minimal disruption if problems are identified. Gradual releases provide more time to monitor for vulnerabilities and other problems, and even when such vulnerabilities are detected, the resulting harms may be more limited than in a scenario where a more capable version is released with the same vulnerabilities [195].
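A hosted API might implement the incremental release described above as a per-request admission check. The sketch below is illustrative only: the stage names, allowlist, rollout fraction, and kill switch are assumed fields, not details from the source.

```python
import hashlib

# Hypothetical rollout configuration for a hosted model API.
# All field names and values are illustrative assumptions.
ROLLOUT = {
    "stage": "limited",          # "closed" -> "limited" -> "general"
    "allowlist": {"trusted-lab-a", "trusted-lab-b"},
    "general_fraction": 0.10,    # share of deployers admitted once "general"
    "kill_switch": False,        # decommission all access if a problem is found
}

def is_admitted(deployer_id: str, rollout: dict) -> bool:
    """Decide whether a deployer may call the hosted model API."""
    if rollout["kill_switch"]:
        # Centralised hosting lets the provider cut off every
        # instance at once with minimal disruption.
        return False
    if rollout["stage"] == "closed":
        return False
    if deployer_id in rollout["allowlist"]:
        # Selected deployers are admitted first.
        return True
    if rollout["stage"] == "general":
        # Deterministic bucketing keeps a deployer's admission stable
        # as the admitted fraction is ramped up over time.
        bucket = int(hashlib.sha256(deployer_id.encode()).hexdigest(), 16) % 100
        return bucket < rollout["general_fraction"] * 100
    return False
```

Centralising the check in this way is what makes gradual release reversible: widening access is a config change, and so is shutting it down.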
2.3.1 Deployment Management
Limit deployment scope
AI models can be restricted in terms of their use cases: providers can require deployers to limit deployment to a predefined scope [81], so that models built for specific purposes and tested in specific environments are not used in environments, or for purposes, that are potentially unsafe.
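A provider-side scope restriction could be sketched as a lookup against a registry of approved use cases. The registry contents, use-case names, and fields below are illustrative assumptions, not drawn from the source or from reference [81].

```python
# Hypothetical registry mapping each approved use case to the
# environment it was built and tested for. Entries are illustrative.
ALLOWED_SCOPES = {
    "clinical-triage": {"environment": "hospital", "tested": True},
    "document-summarisation": {"environment": "office", "tested": True},
}

def deployment_permitted(use_case: str, environment: str) -> bool:
    """Refuse any deployment outside the predefined, tested scope."""
    scope = ALLOWED_SCOPES.get(use_case)
    return (scope is not None
            and scope["tested"]
            and scope["environment"] == environment)
```

The deny-by-default shape matters here: an unlisted use case, or a listed one requested in an untested environment, is rejected rather than allowed through.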
2.3.1 Deployment Management
Restricted usage terms for open-source models
Developers of open-weights and open-source AI models can vet and restrict the users of their AI systems by requiring them to sign a Terms of Service agreement before granting access to the model weights. Such agreements can include limitations on the usage, modification, and proliferation of the AI model [88]. They have the advantage that users need to be vetted only once before receiving model access, but in practice are often of limited effectiveness in preventing unauthorised use or distribution.
2.3.2 Access & Security Controls
Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems
Gipiškis, Rokas; San Joaquin, Ayrton; Chin, Ze Shen; Regenfuß, Adrian; Gil, Ariel; Holtman, Koen (2024)
Organizations and governments that develop, deploy, use, and govern AI must coordinate on effective risk mitigation. However, the landscape of AI risk mitigation frameworks is fragmented, uses inconsistent terminology, and has gaps in coverage. This paper introduces a preliminary AI Risk Mitigation Taxonomy to organize AI risk mitigations and provide a common frame of reference. The Taxonomy was developed through a rapid evidence scan of 13 AI risk mitigation frameworks published between 2023 and 2025, which were extracted into a living database of 831 distinct AI risk mitigations.