
Anthropic updates AI safety policy, separates company commitments from industry recommendations

by HR News Canada Staff

Anthropic has released the third version of its Responsible Scaling Policy (RSP), restructuring how the artificial intelligence company manages and communicates risks from increasingly capable AI systems — changes that have direct implications for businesses using or considering AI tools in the workplace.

The updated policy, published Feb. 24, 2026, introduces a clearer split between what Anthropic will do on its own and what it believes the broader AI industry needs to do collectively to manage risk.

What changed and why

The original RSP, launched in September 2023, used a tiered system of “AI Safety Levels” (ASL) to set safeguard requirements triggered when models reached certain capability thresholds. Anthropic says the framework worked in some areas but fell short in others.

The company says its internal safeguards did improve under the policy, and that other major AI developers — including OpenAI and Google DeepMind — adopted broadly similar frameworks within months of Anthropic’s original announcement. Governments in California, New York, and the European Union have since introduced legislation requiring frontier AI developers to publish risk-management frameworks, an outcome Anthropic says aligns with the RSP’s original intent.

However, the company acknowledges the policy hit structural limits. Determining when a model has actually crossed a risk threshold proved harder than expected, government action on AI safety has moved slowly, and some higher-level safeguards may be impossible for a single company to implement without government or industry co-operation.

Three new elements

The revised RSP introduces three core changes.

First, the policy now formally separates Anthropic’s own commitments from its recommendations for the AI industry as a whole. The company says this distinction reflects the reality that some safeguards require collective action to be effective.

Second, Anthropic will publish a Frontier Safety Roadmap outlining concrete goals across security, model alignment, safeguards, and policy. The company describes these as ambitious but achievable targets rather than hard commitments, and says it will report on progress publicly. Stated goals include launching research projects on advanced information security, developing automated red-teaming methods to stress-test AI systems, and publishing policy proposals for a “regulatory ladder” that scales with risk levels.

Third, the company will publish Risk Reports every three to six months. These reports will describe model capabilities, specific threat scenarios, active risk controls, and an overall risk assessment. External reviewers with AI safety expertise will be brought in to scrutinize the reports under certain conditions, and Anthropic says it is already piloting that process.

Why it matters for employers

As AI tools become more common in HR functions — from recruitment screening to performance management and workforce planning — the safety and governance frameworks behind those tools carry growing relevance for employers.

Anthropic’s updated policy signals that even the developers of frontier AI systems regard some risks, particularly in areas such as biosecurity and model behaviour, as not yet fully resolved. For HR leaders and C-suite executives assessing AI adoption, vendor transparency on risk governance is becoming a practical due-diligence consideration.

The company says it will continue revising the RSP as AI capabilities evolve.
