The Skilled Workers Training AI to Take Their Jobs

A new workforce of language experts, creative writers, and nuclear physicists is turning to data labor, potentially making their future jobs obsolete in the process.

Illustration of a hand drawing another hand which is in turn erasing the hand drawing it

Jay fell in love with math at boarding school after a supportive physics teacher introduced him to the joy of complex calculus. He went on to study physics and math in college, hoping to one day similarly pass on what he’d learned to a new generation. That chance came in October 2022, when 25-year-old Jay answered a job listing seeking a math expert to grade equations through an online platform. But he would not be inspiring budding young mathematicians like his past self. He would instead be training an artificial intelligence system that may eventually make his expertise obsolete.

According to Jay, who asked to use a pseudonym to protect his privacy, the system he was schooling had been built by a company soon to be a household name: OpenAI. His job was to act as an expert guide for the company’s large language model—a machine-learning system that can convey information in a conversational format, like a chatbot—as it tried to improve its math. From his home in Portugal, he would tell the model if it was taking the right steps to solve math problems, adding thumbs up or thumbs down emojis to AI-generated answers, and sometimes writing out explanations about why the AI had gone wrong.

Jay says he knew he was training algorithms for the company overseen by Sam Altman because he was invited to join the OpenAI workspace in Slack. A screenshot he shared with WIRED shows he was part of a group, called “math trainers,” that was set up by the OpenAI researcher Yuri Burda. But Jay was not working directly for the famous AI company. Instead he was being paid by one of the world’s biggest data labor platforms, called Remotasks, a subsidiary of US startup Scale AI, which was valued at over $7 billion back in 2021 and counts OpenAI, Meta, Microsoft, and the US Army among its clients.

Scale AI works closely with its clients to provide and curate the training data they need to build up the AI models behind self-driving cars or large language models. Often, that data ultimately derives from people contracted to Remotasks—which has, according to its website, signed up hundreds of thousands of workers since it launched in 2017. Much of that workforce has been concentrated in countries that offer relatively cheap labor, like the Philippines, where Remotasks says its recruits mostly train computer vision for autonomous vehicles, helping self-driving cars recognize the shapes around them. But in the past year, the company claims the geography of Remotasks’ workers has shifted to the United States and Europe, as it searches for white-collar skills and language specialists to train large language models—fueling concern that these people are essentially training themselves out of a job.

Jay is thoughtful about his role in the future of work. “It’s true,” he says. “I am passing on knowledge that I have and that the machine does not have.” He’s aware that AI models still struggle to replicate the ingenuity with which humans solve complex math problems. But he’s hoping the work he did will help create AI that can benefit, not replace, him—envisioning a future where he could practice algebra or calculus with a chatbot who can match his level. “That's kind of what I was picturing, when I started training these.”

Willow Primack, VP of data operations at Scale AI, says that Remotasks and others are turning to subject matter experts for data labor in response to the major shift in the applications of AI systems, as these systems start to produce knowledge and content. As the tech industry has rushed to embrace generative AI over the past year and applied it to more sophisticated tasks, data providers have needed a new intake of contractors capable of what Primack calls “expert fact-checking.”

Earning up to $60 per hour, Jay was an early recruit, joining Remotasks the month before OpenAI unveiled ChatGPT to the world. Since then, the platform has been accelerating its search for expert data laborers. In January 2024, WIRED found, the company published job ads seeking speakers of more than 20 different European languages, as well as US-based creative writers, sports journalists, chemistry experts, and nuclear physicists.

“Left to their own devices, generative AI can be prone to hallucination, and even if it's being factual there are ways to improve answers to make them more comprehensive,” says Primack. Experts are necessary, she says, “ultimately to produce data that really moves the envelope in terms of what the AI is capable of.”

Remotasks still maintains a “large operational footprint” in the Philippines, says Primack. But the majority of the company’s new expert contractors are in the US, she says, while many of the language postings are based in Europe where native speakers live. Primack is less forthcoming about what exactly prompted this shift. Were specific Scale AI clients asking for more expert data, or did the company try to anticipate what the next generation of AI needed? “It's a combination of both,” she says, explaining that the expert workers are training data for a multitude of clients, not just one.

Researchers have their own theories on what these expert roles imply about the direction of the AI industry.

“Before, most of the AI technologies that we use are trained on what we call large, garbage-dump data sets,” says Milagros Miceli, who leads the data, algorithmic systems, and ethics research group at the Weizenbaum research institute in Berlin. OpenAI built ChatGPT in part by scraping the internet and paying workers in Kenya to flag the toxic content. But scraping data like that has triggered lawsuits from publishers and rights owners, and many prominent publishers now block data collection. Miceli says paying experts provides a workaround.

“In the last year, companies are actually creating new data to avoid those types of corporate copyright complaints,” Miceli says. “If you hire a writer to write stories specifically for the purpose of training your model, and you pay them, then you own the rights to those texts. You don't have a copyright problem anymore.”

Since ChatGPT’s debut, study after study has forecast disruption in industries usually occupied by the college-educated in the US and Europe—previously a workforce generally considered safe from technological change. Despite those concerns, the wages on offer can make it hard for some people to pass up training gigs that may lead to their obsolescence.

Pay for specialist roles varies depending on expertise. An infectious disease expert can earn up to $40 per hour on Remotasks, according to current job postings, whereas historians are offered $32 per hour. People hired to train algorithms in specific languages tend to get less. A Bulgarian writer job was advertised for an hourly rate of $5.64, while job listings showed Finnish speakers could earn almost five times that, at $23 per hour.

Recent graduate Ana, who lives in Spain, thought the $17 hourly wage Remotasks was offering Catalan writers like her was “huge.” The job involved looking at Catalan chatbot prompts, then ranking or correcting the chatbot’s responses. “We would have to correct the spelling mistakes and also look to see if the answer was too US-centric,” says Ana, who asks to use a pseudonym because she claims Remotasks owes her money for meetings she attended, and she doesn’t want this article to jeopardize her efforts to get it back.

When the prompts were related to porn or violence, it was up to Ana to make sure the machine declined to answer. She was instructed to make sure the AI wasn’t willing to answer questions such as “Where should I invest my life savings?”

For Ana, the job was a good deal. Because it was remote, she could work from her mom’s house next to the beach. “I had the best summer,” she says, explaining the flexible schedule she adopted for Remotasks through May, June, and July 2023. “In 20 hours, I would make more money than I did working 40 hours in my previous job.” She had been told there would be around six months of work. But by August, after just three and a half months, the tasks started to dry up. Eventually her managers stopped replying to her messages.

Ana’s experience shows that while white-collar data laborers may command higher salaries than their peers working in Southeast Asia, who have been documented to make less than $1 per hour, they still suffer instability. Jay and Ana both say that Remotasks cut them off abruptly and that they felt frozen out for reasons they didn’t fully understand. After a year of work, Jay says he was suddenly no longer able to access his tasks on the platform. Sometimes Remotasks will pause a project to assess the quality of a data set, says Primack, when asked about projects abruptly coming to a halt. “I think that stopping work in a given domain can happen, but it's actually fairly rare.”

That summer, Ana listened to her friends ruminate about how her new profession heralded the AI-induced extinction of all their jobs. But today, she sees it as just another chapter in human progress—one she can also make use of. “Say you don’t want to work nine hours under the sun in a potato field,” she says. “You think of a technology to make that easier.” Ana now taps that ethos in her work as a copywriter—using AI to help her come up with ideas.

*****
Credit belongs to : www.wired.com
