Random Image Display on Page Reload

Researchers Have Ranked AI Models Based on Risk—and Found a Wild Range

Aug 15, 2024 12:00 PM

Researchers Have Ranked AI Models Based on Risk—and Found a Wild Range

Studies suggest that regulations could be tightened to head off AI misbehavior.

An illustration of a glitchy red smiley face and sad face overlayed on top of each other.

Illustration: WIRED Staff; Getty Images

Bo Li, an associate professor at the University of Chicago who specializes in stress testing and provoking AI models to uncover misbehavior, has become a go-to source for some consulting firms. These consultancies are often now less concerned with how smart AI models are than with how problematic—legally, ethically, and in terms of regulatory compliance—they can be.

Li and colleagues from several other universities, as well as Virtue AI, cofounded by Li, and Lapis Labs, recently developed a taxonomy of AI risks along with a benchmark that reveals how rule-breaking different large language models are. “We need some principles for AI safety, in terms of regulatory compliance and ordinary usage,” Li tells WIRED.

The researchers analyzed government AI regulations and guidelines, including those of the US, China, and the EU, and studied the usage policies of 16 major AI companies from around the world.

The researchers also built AIR-Bench 2024, a benchmark that uses thousands of prompts to determine how popular AI models fare in terms of specific risks. It shows, for example, that Anthropic’s Claude 3 Opus ranks highly when it comes to refusing to generate cybersecurity threats, while Google’s Gemini 1.5 Pro ranks highly in terms of avoiding generating nonconsensual sexual nudity.

DBRX Instruct, a model developed by Databricks, scored the worst across the board. When the company released its model in March, it said that it would continue to improve DBRX Instruct’s safety features.

Anthropic, Google, and Databricks did not immediately respond to a request for comment.

Understanding the risk landscape, as well as the pros and cons of specific models, may become increasingly important for companies looking to deploy AI in certain markets or for certain use cases. A company looking to use a LLM for customer service, for instance, might care more about a model’s propensity to produce offensive language when provoked than how capable it is of designing a nuclear device.

Bo says the analysis also reveals some interesting issues with how AI is being developed and regulated. For instance, the researchers found government rules to be less comprehensive than companies’ policies overall, suggesting that there is room for regulations to be tightened.

The analysis also suggests that some companies could do more to ensure their models are safe. “If you test some models against a company’s own policies, they are not necessarily compliant,” Bo says. “This means there is a lot of room for them to improve.”

Other researchers are trying to bring order to a messy and confusing AI risk landscape. This week, two researchers at MIT revealed their own database of AI dangers, compiled from 43 different AI risk frameworks. “Many organizations are still pretty early in that process of adopting AI,” meaning they need guidance on the possible perils, says Neil Thompson, a research scientist at MIT involved with the project.

Peter Slattery, lead on the project and a researcher at MIT’s FutureTech group, which studies progress in computing, says the database highlights the fact that some AI risks get more attention than others. More than 70 percent of frameworks mention privacy and security issues, for instance, but only around 40 percent refer to misinformation.

Efforts to catalog and measure AI risks will have to evolve as AI does. Li says it will be important to explore emerging issues such as the emotional stickiness of AI models. Her company recently analyzed the largest and most powerful version of Meta’s Llama 3.1 model. It found that although the model is more capable, it is not much safer, something that reflects a broader disconnect. “Safety is not really improving significantly,” Li says.

Will Knight is a senior writer for WIRED, covering artificial intelligence. He writes the AI Lab newsletter, a weekly dispatch from beyond the cutting edge of AI—sign up here. He was previously a senior editor at MIT Technology Review, where he wrote about fundamental advances in AI and China’s AI… Read more
Senior Writer

Read More

Israel Is Buying Google Ads to Discredit the UN’s Top Gaza Aid Agency

The UNRWA calls Israel’s strategy of promoting alleged misinformation “destructive.”
Paresh Dave

Pavel Durov’s Arrest Leaves Telegram Hanging in the Balance

Durov has reportedly been detained in France over Telegram’s alleged failure to adequately moderate illegal content on the messaging app. His arrest sparked backlash and left some associates asking, what now?
Darren Loucaides

Will the ‘Car-Free’ Los Angeles Olympics Work?

Organizers of the 2028 Summer Games will attempt to re-create the city’s public transport heyday of the early 20th century, but ousting the car—even for just a few weeks—will be costly.
Jay L. Zagorsky

The Apartment Rental Market Is Rigged by Algorithms, a DOJ Lawsuit Alleges

The US Department of Justice claims that a company called RealPage is responsible for price-fixing across millions of apartment rental units nationwide.
Brian Barrett

How Much Will It Cost to Charge Your Electric Car? It’s Complicated

EV drivers in the US need to get out of that gas station car mindset.
Aarian Marshall

EVs Are Losing Up to 50 Percent of Their Value in One Year

Some electric car brands are hemorrhaging value, with the worst losing as much as $600 a day. In this investigation, WIRED outlines which models to watch, why this is happening, and how you can game the market to your advantage.
Alistair Charlton

An ‘AI Scientist’ Is Inventing and Running Its Own Experiments

Letting programs learn through “open-ended” experimentation may unlock remarkable new capabilities, as well as new risks.
Will Knight

‘Should Art Be Regulated by the SEC?’: NFT Artists’ New Lawsuit Seeks Answers

At issue is whether digital collectibles can be considered a security. “What the SEC has done directly affects my ability to make a living,” one plaintiff says.
Jessica Klein

*****
Credit belongs to : www.wired.com

Check Also

A Popular iOS Illustration App Is Saying No to Generative AI

Benj Edwards, Ars Technica Business Aug 22, 2024 3:41 PM A Popular iOS Illustration App …