Problems with ChatGPT Agent: An Assistant That Takes an Hour to Order Cupcakes

The ChatGPT Agent assistant is facing some odd issues: it needs the user’s permission for nearly every task, makes glaring mistakes, and operates very slowly.

OpenAI recently unveiled its latest AI agent, ChatGPT Agent, a tool that combines research and execution capabilities and promises to handle complex tasks on the user’s behalf. However, the new assistant comes with notable limitations: it requires human approval for every significant action, and in early tests it made major errors and ran slowly, showing that it still has a long way to go before achieving true autonomy.

OpenAI has integrated two of its previous agents, Operator (for performing actions in a browser) and Deep Research (for multi-step investigations), into a single tool built directly into the ChatGPT interface. The agent uses a “virtual computer” to carry out tasks such as checking calendars, planning trips, shopping online, or generating analytical reports.

The Strange Issues of ChatGPT Agent

Despite its impressive features, a fundamental limitation has raised questions about the tool’s usefulness. According to OpenAI’s official announcement, the intelligent agent “asks for user permission before performing any important action.” This means you can’t simply assign it a task and walk away. For every critical step—from booking a ticket to making an online purchase—your presence and final approval are required.
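
The approval requirement described above can be pictured as a simple gate in front of each important step. The Python sketch below is only an illustration under stated assumptions: the names Step, important, and run_with_approval are hypothetical, and nothing here reflects how OpenAI actually implements the feature.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    important: bool  # hypothetical flag: does this step need user sign-off?


def run_with_approval(steps: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Run steps in order, pausing for approval before each important one."""
    log = []
    for step in steps:
        if step.important and not approve(step):
            # The user declined, so the agent halts instead of acting alone.
            log.append(f"blocked: {step.description}")
            break
        log.append(f"done: {step.description}")
    return log


steps = [
    Step("search for bakeries", important=False),
    Step("add cupcakes to cart", important=False),
    Step("pay for the order", important=True),
]

# Stand-in for a real user prompt: decline every important step.
print(run_with_approval(steps, approve=lambda step: False))
```

In this sketch the routine steps run unattended, but the purchase goes through only if the approve callback returns True, which is why the user cannot simply hand off the task and walk away.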

According to Wired’s analysis, this requirement cuts both ways:

  • From a safety perspective: It’s a smart move. Given that AI is prone to making mistakes or falling victim to cyberattacks (like prompt injection), human oversight helps prevent financial or security disasters.
  • From a usability standpoint: This constant need for human intervention undermines the core purpose of an automation tool. The agent is stuck in a strange limbo: it’s too powerful to be left alone, yet too unreliable to handle tasks independently.

Its performance in early tests hasn’t been promising either; in one test, experts reported that ordering a few cupcakes using the agent took nearly an hour.

Moreover, in the product’s promotional video, when ChatGPT Agent is asked to plan a trip to visit all Major League Baseball stadiums in the U.S., it generates a map with one stop placed in the middle of the Gulf of Mexico. Glaring errors like this one, which even the presenters overlooked, show that the technology still has a long way to go before reaching an acceptable level of reliability and efficiency.

The new capability is being released first to Pro subscribers, with a limit of 400 prompts per month. Users on Plus and Team plans will also gain access soon, though with a much stricter cap: one-tenth of the Pro limit, or 40 prompts per month. No release date has been announced yet for free users.