Establishes AI Safety Level Standards (ASLs) for AI model testing, deployment, and security. Requires ASL-2 standards for current models and ASL-3 standards for higher-risk capabilities such as Chemical, Biological, Radiological, and Nuclear (CBRN) weapon development or autonomous R&D. Implements Capability Thresholds that trigger safeguard upgrades, and outlines regular assessments, reviews, and transparency requirements.
Analysis summaries, actor details, and coverage mappings were LLM-classified and may contain errors.
This is an internal corporate policy document establishing Anthropic's own responsible scaling framework. It uses mandatory language ('will', 'must', 'shall'), but these obligations are self-imposed by the company rather than legally binding external requirements. The document establishes internal governance structures, reporting mechanisms, and decision-making processes for the company's AI development activities.
The document has good coverage of approximately 8-10 subdomains, with a strong focus on malicious actors (4.1, 4.2, 4.3), AI system security (2.2), competitive dynamics (6.4), governance failure (6.5), and AI safety failures (7.1, 7.2, 7.3). Coverage is concentrated in the security, misuse-prevention, and AI safety domains.
This is an internal corporate policy governing Anthropic's own operations as an AI development company. The primary sectors governed are Information (AI/technology development) and Scientific Research and Development Services (AI research). The policy does not regulate external sectors but rather establishes governance for the company's AI model development, deployment, and security practices.
The document comprehensively covers Build and Use Model, Verify and Validate, Deploy, and Operate and Monitor stages. It focuses extensively on testing models for dangerous capabilities, implementing safeguards before deployment, and ongoing monitoring post-deployment. Plan and Design is implicitly covered through capability threshold planning, while data collection is not substantially addressed.
The document explicitly covers AI models and AI systems, with detailed focus on frontier AI models. It does not explicitly mention general purpose AI, task-specific AI, foundation models, generative AI, or predictive AI by those specific terms. It extensively discusses compute thresholds using FLOP metrics and Effective Compute. Open-weight models are mentioned in the context of release decisions.
Anthropic
Anthropic is the author and proposer of this Responsible Scaling Policy. The document is their internal governance framework for AI development and deployment, as evidenced by references to 'our models', 'our risk mitigation strategy', and internal organizational structures.
Responsible Scaling Officer; CEO; Board of Directors; Long-Term Benefit Trust
Internal enforcement is conducted by designated roles within Anthropic's governance structure. The Responsible Scaling Officer has primary responsibility for ensuring policy compliance, with oversight from the CEO, Board of Directors, and Long-Term Benefit Trust.
Responsible Scaling Officer; Board of Directors; Long-Term Benefit Trust; third-party reviewers; U.S. Government entity
Monitoring involves both internal oversight (RSO, Board, LTBT) and external mechanisms (third-party reviews, government notification). The policy establishes regular assessment cycles, reporting requirements, and independent audits.
Anthropic
The policy applies to Anthropic itself, governing its own AI model development, deployment, and security practices. It regulates internal activities including model training, deployment decisions, security measures, and employee conduct.
10 subdomains (5 Good, 5 Minimal)