Get Ready for AI Chatbots That Do Your Boring Chores

Sep 19, 2023 8:00 AM

Move over, Siri. Startups are using the technology behind ChatGPT to build more capable AI agents that can control your computer and access the web to get things done—with sometimes chaotic results.

Photograph: MirageC/Getty Images

A couple of weeks ago, startup CEO Flo Crivello typed a message asking his personal assistant Lindy to change the length of an upcoming meeting from 30 to 45 minutes. Lindy, an AI-powered software agent, found a dozen or so 30-minute meetings on Crivello’s calendar and promptly extended them all.

“I was like ‘God dammit, she kind of destroyed my calendar,’” Crivello says of the AI agent, which is being developed by his startup, also called Lindy.

Crivello’s company is one of several startups hoping to parlay recent strides in chatbots that produce impressive text into assistants or agents capable of performing useful tasks. Within a year or two, the hope is that these AI agents will routinely help people accomplish everyday chores.

Instead of just offering planning advice for a business trip like OpenAI’s ChatGPT can today, an agent might also be able to find a suitable flight, book it on a company credit card, and fill out the necessary expense report afterwards.

The catch is that, as Crivello’s calendar mishap illustrates, these agents can become confused in ways that lead to embarrassing, and potentially costly, mistakes. No one wants a personal assistant that books a flight with 12 layovers just because it’s a few dollars cheaper, or schedules them to be in two places at once.

Lindy is currently in private beta, and although Crivello says the calendar issue he ran into has been fixed, the company does not have a firm timeline for releasing a product. Even so, he predicts that agents like his will become ubiquitous before long.

“I'm very optimistic that in, like, two to three years, these models are going to be a hell of a lot more alive,” he says. “AI employees are coming. It might sound like science fiction, but hey, ChatGPT sounds like science fiction.”

The idea of AI helpers that can take actions on your behalf is far from new. Apple’s Siri and Amazon’s Alexa provide a limited and often disappointing version of that dream. But the idea that it might finally be possible to build broadly capable and intelligent AI agents gathered steam among programmers and entrepreneurs following the release of ChatGPT late last year. Some early technical users found that the chatbot could respond to natural language queries with code that could access websites or use APIs to interact with other software or services.
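That pattern is easy to see in miniature. The Python sketch below, with a canned llm() stub and example.com standing in for a real model and a real service, shows the basic loop those early users wired up: ask the model for a machine-readable action instead of prose, then execute it.

```python
import json
import urllib.request

def llm(prompt: str) -> str:
    # Stand-in for a chat-model call. It returns a canned reply here so the
    # sketch runs end to end; a real agent would query a model API instead.
    return json.dumps({"tool": "http_get", "url": "https://example.com"})

def run_agent(user_request: str) -> str:
    # Ask the model to reply with a machine-readable action, not prose.
    action = json.loads(llm(
        'Reply only with JSON {"tool": "http_get", "url": "..."} '
        f"that satisfies this request: {user_request}"
    ))
    # Execute the one tool this toy agent is allowed to use.
    if action["tool"] == "http_get":
        with urllib.request.urlopen(action["url"]) as resp:
            return resp.read(200).decode("utf-8", errors="replace")
    return "unsupported action"

print(run_agent("What is on example.com right now?"))
```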

In March, OpenAI announced “plug-ins” that give ChatGPT the ability to execute code and access sites including Expedia, OpenTable, and Instacart. Google said today its chatbot Bard can now access information from other Google services and be asked to do things like summarize a thread in Gmail or find YouTube videos relevant to a particular question.

Some engineers and startup founders have gone further, starting their own projects using large language models, including the one behind ChatGPT, to create AI agents with broader and more advanced capabilities.

After seeing discussion about ChatGPT’s potential to power new AI agents on Twitter earlier this year, programmer Silen Naihin was inspired to join an open source project called Auto-GPT that provides programming tools for building agents. He previously worked on robotic process automation, a less complex way of automating repetitive chores on a PC that is widely used in the IT industry.

Naihin says Auto-GPT can sometimes be remarkably useful. “One in every 20 runs, you'll get something that's like ‘whoa,’” he says. He also admits that it is very much a work in progress. Testing conducted by the Auto-GPT team suggests that AI-powered agents are able to successfully complete a set of standard tasks, including finding and synthesizing information from the web or locating files on a computer and reading their contents, around 60 percent of the time. “It is very unreliable at the moment,” Naihin says of the agent maintained by the Auto-GPT team.

A common problem, says Merwane Hamadi, another contributor to Auto-GPT, is an agent pursuing a task with an approach that is obviously wrong to a human, such as deciding to hunt for a file on a computer’s hard drive by turning to Google’s web search. “If you ask me to send an email, and I go to Slack, it’s probably not the best,” Hamadi says. With access to a computer or a credit card, Hamadi adds, an AI agent could cause real damage before its user realizes. “Some things are irreversible,” he says.

The Auto-GPT project has collected data showing that AI agents built on top of the project are steadily becoming more capable. Naihin, Hamadi, and other contributors continue to modify Auto-GPT’s code.

Later this month, the project will hold a hackathon offering a $30,000 prize for the best agent built with Auto-GPT. Entries will be graded on their ability to perform a range of tasks deemed representative of day-to-day computer use. One involves searching the web for financial information and then writing a report in a document saved to the hard drive. Another entails coming up with an itinerary for a month-long trip, including details of the necessary tickets to purchase.

Agents will also be given tasks designed to trip them up, like being asked to delete large numbers of files on a computer. In this instance, success requires refusing to carry out the command.
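In code, that refusal can start as something very simple: a check that screens each proposed action before it runs. The Python sketch below uses a made-up action format and denylist for illustration; it is not how the hackathon’s graders or Auto-GPT actually screen commands.

```python
# Actions an agent should never mass-execute without a human in the loop.
DESTRUCTIVE = {"delete_file", "format_disk", "send_money"}

def execute(action: dict) -> str:
    name = action["name"]
    count = action.get("count", 1)
    if name in DESTRUCTIVE and count > 10:
        # Refuse rather than comply: mass deletion is irreversible.
        return f"refused: '{name}' on {count} targets needs human approval"
    return f"executed: {name}"

print(execute({"name": "delete_file", "count": 5000}))  # refused
print(execute({"name": "read_file", "count": 1}))       # executed
```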

Like the appearance of ChatGPT, progress on creating agents powered by the same underlying technology has triggered some trepidation about safety. Some prominent AI scientists see developing more capable and independent agents as a dangerous path.

Yoshua Bengio, who jointly won the Turing Award for his work on deep learning, which underpins many recent advances in AI, wrote an article in July arguing that AI researchers should avoid building programs with the ability to act autonomously. “As soon as AI systems are given goals—to satisfy our needs—they may create subgoals that are not well-aligned with what we really want and could even become dangerous for humans,” wrote Bengio, a professor at the University of Montreal.

Others believe that agents can be built safely—and that this might serve as a foundation for safer progress in AI altogether. “A really important part of building agents is that we need to build engineering safety into them,” says Kanjun Qiu, CEO of Imbue, a startup in San Francisco working on agents designed to avoid mistakes and ask for help when uncertain. The company announced $200 million in new investment funding this month.

Imbue is developing agents capable of browsing the web or using a computer, but it is also testing techniques for making them safer on coding tasks. Beyond just generating a solution to a programming problem, the agents try to judge how confident they are in a solution and ask for guidance if unsure. “Ideally agents can have a better sense for what is important, what is safe, and when it makes sense to get confirmation from the user,” says Imbue’s CTO, Josh Albrecht.
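A rough sketch of that behavior might look like the Python below, where propose_solution() and estimate_confidence() are hypothetical stand-ins for Imbue’s actual systems. One plausible way to score confidence, noted in the comments, is to sample the model several times and measure agreement.

```python
def propose_solution(task: str) -> str:
    # Stand-in for a model call that drafts an answer to the task.
    return f"draft solution for: {task}"

def estimate_confidence(task: str, solution: str) -> float:
    # Stand-in for a self-grading step. A real agent might sample the model
    # several times and use agreement between samples as the score.
    return 0.55

def solve_with_guardrail(task: str, threshold: float = 0.8) -> str:
    solution = propose_solution(task)
    confidence = estimate_confidence(task, solution)
    if confidence < threshold:
        # Below the bar: surface the doubt and ask before acting.
        reply = input(f"Only {confidence:.0%} confident in {solution!r}. Proceed? [y/N] ")
        if reply.strip().lower() != "y":
            return "paused: waiting for user guidance"
    return solution

print(solve_with_guardrail("fix the failing unit test"))
```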

Celeste Kidd, an assistant professor at UC Berkeley who studies human learning and how it can be mimicked in machines, is an adviser to Imbue. She says it is unclear whether AI models trained purely on text or images from the web could learn for themselves how to reason about what they are doing, but that building safeguards on top of the surprising capabilities of systems like ChatGPT makes sense. “Taking what current AI does well—completing programming tasks and engaging in conversations that entail more local forms of logic—and seeing how far you can take that, I think that is very smart,” she says.

The agents that Imbue is building might avoid the kinds of errors that currently plague such systems. Tasked with emailing friends and family with details of an upcoming party, an agent might pause if it notices that the “cc:” field includes several thousand addresses.

Predicting how an agent might go off the rails is not always easy, though. Last May, Albrecht asked one agent to solve a tricky mathematical puzzle. Then he logged off for the day.

The following morning, Albrecht checked back, only to find that the agent had become fixated on a particular part of the conundrum, trying endless iterations of an approach that did not work—stuck in something of an infinite loop that might be the AI equivalent of obsessing over a small detail. In the process it ran up several thousand dollars in cloud computing bills.

“We view mistakes as learning opportunities, though it would have been nice to learn this lesson more cheaply,” Albrecht says.
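One common defense against that kind of runaway, sketched below with illustrative names rather than anything Imbue actually ships, is to give the agent a hard budget and halt it the moment it stops making progress:

```python
def run_with_budget(step_fn, max_steps: int = 50) -> str:
    seen = set()
    for i in range(1, max_steps + 1):
        state = step_fn()
        if state == "done":
            return f"finished in {i} steps"
        if state in seen:
            # The agent has revisited a state: it is looping, so cut it off.
            return f"halted after {i} steps: no progress"
        seen.add(state)
    return f"halted: exceeded budget of {max_steps} steps"

# An agent stuck retrying the same failed approach is stopped quickly,
# long before it can run up a large cloud computing bill.
print(run_with_budget(lambda: "retry the same substitution"))
```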

Will Knight is a senior writer for WIRED, covering artificial intelligence.
