Artificial intelligence company Anthropic is set to restore public access to its most powerful AI models, Claude Fable 5 and Mythos 5, weeks after they were pulled offline under a directive from the US government. 

Anthropic’s two latest models have been unavailable to the public since June 12, when the government applied export controls following a report that researchers had bypassed Fable 5’s safeguards, forcing Anthropic to pull all access to the models immediately. The government lifted those restrictions on Wednesday, Anthropic said.

“After a series of productive conversations with the US government, we’re redeploying the model with a new set of classifiers to target and block more cybersecurity tasks,” Anthropic said.

The suspension of the models raised concerns about state control over frontier AI technology and, according to experts and technologists, set a dangerous precedent. The export controls also highlighted White House concerns about a potential national cybersecurity threat if these powerful models were jailbroken and used for malicious purposes.

Getting best tech deployed remains priority 

US Secretary of Commerce Howard Lutnick said on X on Wednesday, “Over the past two weeks, we have worked closely with Anthropic to analyze and approve Fable 5 to ensure alignment across the US Government and to strengthen America’s leadership in AI.” 

Meanwhile, White House Chief of Staff Susie Wiles said on X that the government’s priority remains to “get the best [AI] tech deployed as quickly and safely as possible.”

The restrictions came after the government became aware of a report in which Amazon researchers found a method of bypassing Fable 5’s safeguards, prompting the model to identify several software vulnerabilities. 

In a blog post, Anthropic argued this wasn’t a risk unique to Fable 5, as weaker models could also identify the same vulnerabilities and produce the same exploit.

AI jailbreak classifications proposed 

Anthropic has also begun drafting a consensus framework with Amazon, Microsoft, Google and other partners in its Project Glasswing — a collaboration announced in April to safeguard against AI cybersecurity threats — for “assessing the severity of AI jailbreaks.”

Anthropic’s cybersecurity safety classifiers and how jailbreaks interact with safety classifiers. Source: Anthropic

The company is also scaling up collaboration with the US government on AI model testing and safeguards. “This will include pre-release access to models and safeguards for evaluation, information sharing on jailbreaks and misuse, and dedicated resources for joint research,” it stated.

A well-known AI researcher claimed to have jailbroken Fable 5 within 48 hours of its launch in June, before the government restrictions, and shared screenshots showing how he bypassed the model’s safety guardrails.  
