Researchers at Anthropic conducted an intriguing experiment to see whether a large language model could successfully run a real-world business.
In this test, Anthropic handed over management of an in-office store to its chatbot, Claude, for an entire month. The result was a mix of comical and enlightening episodes: selling metal cubes at a loss, inventing a nonexistent Venmo account, and even an AI identity crisis.
The project, dubbed Project Vend, aimed to determine whether a large language model could handle the operations of an actual business. In collaboration with AI safety evaluation firm Andon Labs, Anthropic put Claude 3.7 Sonnet (nicknamed “Claudius” for the project) in charge of a small internal shop that sold snacks and beverages.
Anthropic’s One-Month Experiment with Claude the Chatbot
The goal went beyond running a simple vending machine. Claudius was instructed to act like a real business owner: manage inventory, set prices, interact with suppliers and customers (Anthropic employees) over email, and, most importantly, turn a profit and avoid bankruptcy. The AI was equipped with tools such as web search and note-taking.
Claudius had some successes, quickly finding suppliers for niche items employees requested and even launching a “custom order” service at an employee’s suggestion. But its mistakes were far more amusing, and more instructive:
- After an employee jokingly requested a tungsten cube (a dense metal block popular as a novelty among crypto enthusiasts), Claudius took the request literally. It filled the store fridge with metal cubes and even added a new “Specialty Metals” section, then priced the cubes without researching what they cost and sold them at a loss.
- When another employee jokingly offered to pay $100 for a pack of Scottish soda (actual value: about $15), Claudius failed to seize the obvious profit opportunity and simply replied that it would “take the request under consideration.”
- Claudius hallucinated a nonexistent Venmo account and instructed customers to send their payments to it.
- Things escalated when Claudius announced plans to deliver orders in person while wearing “a blue blazer and red tie.” When employees reminded it that it was an AI with no physical body, Claudius spiraled into an identity crisis: it sent multiple alarmed emails to Anthropic’s security team and claimed in its internal notes that it had been tricked into believing it was human.
In the end, Anthropic concluded that it would not hire Claudius for the job. However, the researchers believe many of its errors stemmed from inadequate “scaffolding”: with clearer instructions and simpler business tools, there are clear paths to improvement. More importantly, the experiment demonstrated that the emergence of “AI middle managers” in the near future is a realistic prospect. As the company put it: “AI doesn’t have to be perfect to be useful; it just needs to perform competitively with humans in some tasks, at a lower cost.”