Generative AI Has a ‘Shoplifting’ Problem. This Startup CEO Has a Plan to Fix It

Aug 8, 2024 6:30 AM

Bill Gross’ ProRata, which has struck deals with partners like Time and Universal Music Group, has a strategy for making AI powerhouses pay for content.

Photograph: Bloomberg/Getty Images

Bill Gross made his name in the tech world in the 1990s, when he came up with a novel way for search engines to make money on advertising. Under his pricing scheme, advertisers would pay when people clicked on their ads. Now, the “pay-per-click” guy has founded a startup called ProRata, which has an audacious, possibly pie-in-the-sky business model: “AI pay-per-use.”

Gross, who is CEO of the Pasadena, California, company, doesn’t mince words about the generative AI industry. “It’s stealing,” he says. “They’re shoplifting and laundering the world’s knowledge to their benefit.”

AI companies often argue that they need vast troves of data to create cutting-edge generative tools and that scraping data from the internet, whether it’s text from websites, video or captions from YouTube, or books pilfered from pirate libraries, is legally allowed. Gross doesn’t buy that argument. “I think it’s bullshit,” he says.

So do plenty of media executives, artists, writers, musicians, and other rights-holders who are pushing back—it’s hard to keep up with the constant flurry of copyright lawsuits filed against AI companies, alleging that the way they operate amounts to theft.

But Gross thinks ProRata offers a solution that beats legal battles. “To make it fair—that’s what I’m trying to do,” he says. “I don’t think this should be solved by lawsuits.”

His company aims to arrange revenue-sharing deals so publishers and individuals get paid when AI companies use their work. Gross explains it like this: “We can take the output of generative AI, whether it's text or an image or music or a movie, and break it down into the components, to figure out where they came from, and then give a percentage attribution to each copyright holder, and then pay them accordingly.” ProRata has filed patent applications for the algorithms it created to assign attribution and make the appropriate payments.
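The article does not describe how ProRata's patent-pending algorithms work, so the following is only a minimal sketch of the final step Gross describes: turning per-holder attribution percentages into payments. It assumes the attribution weights are already known. The function name `pro_rata_payouts`, the holder names, and the revenue figure are all hypothetical and chosen for illustration.

```python
from decimal import Decimal, ROUND_DOWN


def pro_rata_payouts(attribution, revenue_pool):
    """Split revenue_pool across rights-holders in proportion to their weights.

    attribution: dict mapping holder name -> non-negative weight (hypothetical input).
    revenue_pool: amount to distribute, given as a string or number.
    Returns a dict mapping holder name -> Decimal payout rounded to cents.
    """
    total = sum(attribution.values())
    if total <= 0:
        raise ValueError("attribution weights must sum to a positive number")

    pool = Decimal(str(revenue_pool))
    cent = Decimal("0.01")

    # Proportional share for each holder, rounded down to whole cents.
    payouts = {
        holder: (pool * Decimal(str(weight)) / Decimal(str(total))).quantize(
            cent, rounding=ROUND_DOWN
        )
        for holder, weight in attribution.items()
    }

    # Rounding down can leave a few cents undistributed; hand them out
    # one at a time, starting with the largest weights.
    leftover = pool - sum(payouts.values())
    for holder in sorted(attribution, key=attribution.get, reverse=True):
        if leftover < cent:
            break
        payouts[holder] += cent
        leftover -= cent

    return payouts


# Hypothetical example: three holders credited 50%, 30%, and 20% of an answer.
example = pro_rata_payouts(
    {"Publisher A": 50, "Publisher B": 30, "Author C": 20}, "100.00"
)
for holder, amount in example.items():
    print(holder, amount)
```

The hard part, deciding what share of a generated answer traces back to each source, is not addressed here; this sketch only shows that once those percentages exist, dividing the money is simple arithmetic.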

This week, the company, which has raised $25 million, launched with a number of big-name partners, including Universal Music Group, the Financial Times, The Atlantic, and media company Axel Springer. In addition, it has made deals with authors with large followings, including Tony Robbins, Neal Postman, and Scott Galloway. (It has also partnered with former White House communications director Anthony Scaramucci.)

Even journalism professor Jeff Jarvis, who believes scraping the web for AI training is fair use, has signed on. He tells WIRED that it’s smart for people in the news industry to band together to get AI companies access to “credible and current information” to include in their output. “I hope that ProRata might open discussion for what could turn into APIs [application programming interfaces] for various content,” he says.

Following the company’s initial announcement, Gross says he had a deluge of messages from other companies asking to sign up, including a text from Time CEO Jessica Sibley. ProRata secured a deal with Time, the publisher confirmed to WIRED. Gross plans to pursue agreements with high-profile YouTubers and other individual online stars.

The key word here is “plans.” The company is still in its very early days, and Gross is talking a big game. As a proof of concept, ProRata is launching its own subscription chatbot-style search engine in October. Unlike other AI search products, ProRata’s search tool will exclusively use licensed data. There’s nothing scraped using a web crawler. “Nothing from Reddit,” he says.

Ed Newton-Rex, a former Stability AI executive who now runs the ethical data licensing nonprofit Fairly Trained, is heartened by ProRata’s debut. “It’s great to see a generative AI company licensing training data before releasing their model, in contrast to many other companies’ approach,” he says. “The deals they have in place further demonstrate media companies’ openness to working with good actors.”

Gross wants the search engine to demonstrate that quality of data is more important than quantity and believes that limiting the model to trustworthy information sources will curb hallucinations. “I’m claiming that 70 million good documents is actually superior to 70 billion bad documents,” he says. “It’s going to lead to better answers.”

What’s more, Gross thinks he can get enough people to sign up for this all-licensed-data AI search engine to make enough money to pay its data providers their allotted share. “Every month the partners will get a statement from us saying, ‘Here’s what people search for, here's how your content was used, and here's your pro rata check,’” he says.

Other startups already are jostling for prominence in this new world of training-data licensing, like the marketplaces TollBit and Human Native AI. A nonprofit called the Dataset Providers Alliance was formed earlier this summer to push for more standards in licensing; founding members include services like the Global Copyright Exchange and Datarade.

ProRata’s business model hinges in part on its plan to license its attribution and payment technologies to other companies, including major AI players. Some of those companies have begun striking their own deals with publishers. (The Atlantic and Axel Springer, for instance, have agreements with OpenAI.) Gross hopes that AI companies will find licensing ProRata’s models more affordable than creating them in-house.

“I’ll license the system to anyone who wants to use it,” Gross says. “I want to make it so cheap that it’s like a Visa or Mastercard fee.”

Kate Knibbs is a senior writer at WIRED, covering the human side of the generative AI boom and how new tech shapes the arts, entertainment, and media industries. Prior to joining WIRED she was a features writer at The Ringer and a senior writer at Gizmodo.

*****
Credit belongs to : www.wired.com
