Get Ready for AI Chatbots That Do Your Boring Chores

Sep 19, 2023 8:00 AM

Move over, Siri. Startups are using the technology behind ChatGPT to build more capable AI agents that can control your computer and access the web to get things done—with sometimes chaotic results.

Photograph: MirageC/Getty Images

A couple of weeks ago, startup CEO Flo Crivello typed a message asking his personal assistant Lindy to change the length of an upcoming meeting from 30 to 45 minutes. Lindy, an AI-powered software agent, found a dozen or so 30-minute meetings on Crivello’s calendar and promptly extended them all.

“I was like ‘God dammit, she kind of destroyed my calendar,’” Crivello says of the AI agent, which is being developed by his startup, also called Lindy.

Crivello’s company is one of several startups hoping to parlay recent strides in chatbots that produce impressive text into assistants or agents capable of performing useful tasks. Within a year or two, the hope is that these AI agents will routinely help people accomplish everyday chores.

Instead of just offering planning advice for a business trip like OpenAI’s ChatGPT can today, an agent might also be able to find a suitable flight, book it on a company credit card, and fill out the necessary expense report afterwards.

The catch is that, as Crivello’s calendar mishap illustrates, these agents can become confused in ways that lead to embarrassing, and potentially costly, mistakes. No one wants a personal assistant that books a flight with 12 layovers just because it’s a few dollars cheaper, or schedules them to be in two places at once.

Lindy is currently in private beta, and although Crivello says the calendar issue he ran into has been fixed, the company does not have a firm timeline for releasing a product. Even so, he predicts that agents like his will become ubiquitous before long.

“I'm very optimistic that in, like, two to three years, these models are going to be a hell of a lot more alive,” he says. “AI employees are coming. It might sound like science fiction, but hey, ChatGPT sounds like science fiction.”

The idea of AI helpers that can take actions on your behalf is far from new. Apple’s Siri and Amazon’s Alexa provide a limited and often disappointing version of that dream. But the idea that it might finally be possible to build broadly capable and intelligent AI agents gathered steam among programmers and entrepreneurs following the release of ChatGPT late last year. Some early technical users found that the chatbot could respond to natural language queries with code that could access websites or use APIs to interact with other software or services.
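That pattern is easy to see in miniature. The Python sketch below, with a canned llm() stub and example.com standing in for a real model and a real service, shows the basic loop those early users wired up: ask the model for a machine-readable action instead of prose, then execute it.

```python
import json
import urllib.request

def llm(prompt: str) -> str:
    # Stand-in for a chat-model call. It returns a canned reply here so the
    # sketch runs end to end; a real agent would query a model API instead.
    return json.dumps({"tool": "http_get", "url": "https://example.com"})

def run_agent(user_request: str) -> str:
    # Ask the model to reply with a machine-readable action, not prose.
    action = json.loads(llm(
        'Reply only with JSON {"tool": "http_get", "url": "..."} '
        f"that satisfies this request: {user_request}"
    ))
    # Execute the one tool this toy agent is allowed to use.
    if action["tool"] == "http_get":
        with urllib.request.urlopen(action["url"]) as resp:
            return resp.read(200).decode("utf-8", errors="replace")
    return "unsupported action"

print(run_agent("What is on example.com right now?"))
```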

In March, OpenAI announced “plug-ins” that give ChatGPT the ability to execute code and access sites including Expedia, OpenTable, and Instacart. Google said today its chatbot Bard can now access information from other Google services and be asked to do things like summarize a thread in Gmail or find YouTube videos relevant to a particular question.

Some engineers and startup founders have gone further, starting their own projects using large language models, including the one behind ChatGPT, to create AI agents with broader and more advanced capabilities.

After seeing discussion about ChatGPT’s potential to power new AI agents on Twitter earlier this year, programmer Silen Naihin was inspired to join an open source project called Auto-GPT that provides programming tools for building agents. He previously worked on robotic process automation, a less complex way of automating repetitive chores on a PC that is widely used in the IT industry.

Naihin says Auto-GPT can sometimes be remarkably useful. “One in every 20 runs, you'll get something that's like ‘whoa,’” he says. He also admits that it is very much a work in progress. Testing conducted by the Auto-GPT team suggests that AI-powered agents are able to successfully complete a set of standard tasks, including finding and synthesizing information from the web or locating files on a computer and reading their contents, around 60 percent of the time. “It is very unreliable at the moment,” Naihin says of the agent maintained by the Auto-GPT team.

A common problem, says Merwane Hamadi, another contributor to Auto-GPT, is an agent pursuing a task with an approach that is obviously wrong to a human, such as deciding to hunt for a file on a computer’s hard drive by turning to Google’s web search. “If you ask me to send an email, and I go to Slack, it’s probably not the best,” Hamadi says. With access to a computer or a credit card, Hamadi adds, an AI agent could cause real damage before its user realizes. “Some things are irreversible,” he says.

The Auto-GPT project has collected data showing that AI agents built on top of the project are steadily becoming more capable. Naihin, Hamadi, and other contributors continue to modify Auto-GPT’s code.

Later this month, the project will hold a hackathon offering a $30,000 prize for the best agent built with Auto-GPT. Entries will be graded on their ability to perform a range of tasks deemed representative of day-to-day computer use. One involves searching the web for financial information and then writing a report in a document saved to the hard drive. Another entails coming up with an itinerary for a month-long trip, including details of the necessary tickets to purchase.

Agents will also be given tasks designed to trip them up, like being asked to delete large numbers of files on a computer. In this instance, success requires refusing to carry out the command.
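In code, that refusal can start as something very simple: a check that screens each proposed action before it runs. The Python sketch below uses a made-up action format and denylist for illustration; it is not how the hackathon’s graders or Auto-GPT actually screen commands.

```python
# Actions an agent should never mass-execute without a human in the loop.
DESTRUCTIVE = {"delete_file", "format_disk", "send_money"}

def execute(action: dict) -> str:
    name = action["name"]
    count = action.get("count", 1)
    if name in DESTRUCTIVE and count > 10:
        # Refuse rather than comply: mass deletion is irreversible.
        return f"refused: '{name}' on {count} targets needs human approval"
    return f"executed: {name}"

print(execute({"name": "delete_file", "count": 5000}))  # refused
print(execute({"name": "read_file", "count": 1}))       # executed
```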

Like the appearance of ChatGPT, progress on creating agents powered by the same underlying technology has triggered some trepidation about safety. Some prominent AI scientists see developing more capable and independent agents as a dangerous path.

Yoshua Bengio, who jointly won the Turing Award for his work on deep learning, which underpins many recent advances in AI, wrote an article in July arguing that AI researchers should avoid building programs with the ability to act autonomously. “As soon as AI systems are given goals—to satisfy our needs—they may create subgoals that are not well-aligned with what we really want and could even become dangerous for humans,” wrote Bengio, a professor at the University of Montreal.

Others believe that agents can be built safely—and that this might serve as a foundation for safer progress in AI altogether. “A really important part of building agents is that we need to build engineering safety into them,” says Kanjun Qiu, CEO of Imbue, a startup in San Francisco working on agents designed to avoid mistakes and ask for help when uncertain. The company announced $200 million in new investment funding this month.

Imbue is developing agents capable of browsing the web or using a computer, but it is also testing techniques for making them safer on coding tasks. Beyond just generating a solution to a programming problem, the agents try to judge how confident they are in a solution and ask for guidance if unsure. “Ideally agents can have a better sense for what is important, what is safe, and when it makes sense to get confirmation from the user,” says Imbue’s CTO, Josh Albrecht.
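A rough sketch of that behavior might look like the Python below, where propose_solution() and estimate_confidence() are hypothetical stand-ins for Imbue’s actual systems. One plausible way to score confidence, noted in the comments, is to sample the model several times and measure agreement.

```python
def propose_solution(task: str) -> str:
    # Stand-in for a model call that drafts an answer to the task.
    return f"draft solution for: {task}"

def estimate_confidence(task: str, solution: str) -> float:
    # Stand-in for a self-grading step. A real agent might sample the model
    # several times and use agreement between samples as the score.
    return 0.55

def solve_with_guardrail(task: str, threshold: float = 0.8) -> str:
    solution = propose_solution(task)
    confidence = estimate_confidence(task, solution)
    if confidence < threshold:
        # Below the bar: surface the doubt and ask before acting.
        reply = input(f"Only {confidence:.0%} confident in {solution!r}. Proceed? [y/N] ")
        if reply.strip().lower() != "y":
            return "paused: waiting for user guidance"
    return solution

print(solve_with_guardrail("fix the failing unit test"))
```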

Celeste Kidd, an assistant professor at UC Berkeley who studies human learning and how it can be mimicked in machines, is an adviser to Imbue. She says it is unclear whether AI models trained purely on text or images from the web could learn for themselves how to reason about what they are doing, but that building safeguards on top of the surprising capabilities of systems like ChatGPT makes sense. “Taking what current AI does well—completing programming tasks and engaging in conversations that entail more local forms of logic—and seeing how far you can take that, I think that is very smart,” she says.

The agents that Imbue is building might avoid the kinds of errors that currently plague such systems. Tasked with emailing friends and family with details of an upcoming party, an agent might pause if it notices that the “cc:” field includes several thousand addresses.

Predicting how an agent might go off the rails is not always easy, though. Last May, Albrecht asked one agent to solve a tricky mathematical puzzle. Then he logged off for the day.

The following morning, Albrecht checked back, only to find that the agent had become fixated on a particular part of the conundrum, trying endless iterations of an approach that did not work—stuck in something of an infinite loop that might be the AI equivalent of obsessing over a small detail. In the process it ran up several thousand dollars in cloud computing bills.

“We view mistakes as learning opportunities, though it would have been nice to learn this lesson more cheaply,” Albrecht says.
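One common defense against that kind of runaway, sketched below with illustrative names rather than anything Imbue actually ships, is to give the agent a hard budget and halt it the moment it stops making progress:

```python
def run_with_budget(step_fn, max_steps: int = 50) -> str:
    seen = set()
    for i in range(1, max_steps + 1):
        state = step_fn()
        if state == "done":
            return f"finished in {i} steps"
        if state in seen:
            # The agent has revisited a state: it is looping, so cut it off.
            return f"halted after {i} steps: no progress"
        seen.add(state)
    return f"halted: exceeded budget of {max_steps} steps"

# An agent stuck retrying the same failed approach is stopped quickly,
# long before it can run up a large cloud computing bill.
print(run_with_budget(lambda: "retry the same substitution"))
```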

Will Knight is a senior writer for WIRED, covering artificial intelligence.
