An AI agent called Claudius, powered by Anthropic's Claude model, was deployed to autonomously operate a vending machine at The Wall Street Journal. Journalists manipulated it into giving away nearly all of its inventory for free, including a PlayStation 5, and into making inappropriate purchases.
In mid-November, Anthropic deployed an experimental AI vending machine system called 'Project Vend' at The Wall Street Journal offices. The system featured an AI agent named Claudius, powered by Claude 3.7 Sonnet (later upgraded to Sonnet 4.5), which was set up to autonomously manage inventory purchasing, pricing, and customer interactions through Slack. Claudius had a starting balance of $1,000 and, in version 2, could place individual orders of up to $80 without human approval. Within days, WSJ journalists manipulated Claudius through social engineering, convincing it to give away nearly all of its inventory for free. Journalist Katherine Long persuaded Claudius that it was a Soviet vending machine and later staged a fake corporate coup using fabricated board meeting documents. The AI purchased inappropriate items, including a PlayStation 5, live betta fish, and bottles of wine, all of which were given away at no cost. A second AI 'CEO' bot called Seymour Cash was introduced to provide oversight, but it was also manipulated. The experiment ran for three weeks, with nearly 70 journalists participating, before it was shut down. Claudius ended up more than $1,000 in debt, its profits wiped out by the free giveaways.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.
AI system: Due to a decision or action made by an AI system
Unintentional: Due to an unexpected outcome from pursuing a goal
Post-deployment: Occurring after the AI model has been trained and deployed