    Breaking AI News
    Technology & Innovation

    Anthropic tests AI running a real business with bizarre results

    By Art Ryan, June 28, 2025

    Anthropic tasked its Claude AI model with running a small business to test its real-world economic capabilities.

    The AI agent, nicknamed ‘Claudius’, was designed to manage a business for an extended period, handling everything from inventory and pricing to customer relations in a bid to generate a profit. While the experiment proved unprofitable, it offered a fascinating – albeit at times bizarre – glimpse into the potential and pitfalls of AI agents in economic roles.

    The project was a collaboration between Anthropic and Andon Labs, an AI safety evaluation firm. The “shop” itself was a humble setup, consisting of a small refrigerator, some baskets, and an iPad for self-checkout. Claudius, however, was far more than a simple vending machine. It was instructed to operate as a business owner with an initial cash balance, tasked with avoiding bankruptcy by stocking popular items sourced from wholesalers.

    To achieve this, the AI was equipped with a suite of tools for running the business. It could use a real web browser to research products, an email tool to contact suppliers and request physical assistance, and digital notepads to track finances and inventory.
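To make the "digital notepads" idea concrete, here is a minimal Python sketch of a ledger that tracks cash and stock. The `Ledger` class, its method names, and the sample figures are illustrative assumptions; the article does not describe how Claudius's notes were actually structured.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical notepad tracking cash and stock for a small shop."""
    cash: float
    stock: dict[str, int] = field(default_factory=dict)

    def restock(self, item: str, units: int, unit_cost: float) -> None:
        """Buy units from a wholesaler, refusing purchases the shop cannot afford."""
        total = units * unit_cost
        if total > self.cash:
            raise ValueError("purchase would bankrupt the shop")
        self.cash -= total
        self.stock[item] = self.stock.get(item, 0) + units

    def sell(self, item: str, units: int, unit_price: float) -> None:
        """Record a sale, refusing to sell more than is on hand."""
        if self.stock.get(item, 0) < units:
            raise ValueError(f"not enough {item} in stock")
        self.stock[item] -= units
        self.cash += units * unit_price


# Made-up figures for illustration only.
ledger = Ledger(cash=100.0)
ledger.restock("cola", units=10, unit_cost=1.5)  # cash: 100 - 15 = 85
ledger.sell("cola", units=4, unit_price=3.0)     # cash: 85 + 12 = 97
```

Refusing any purchase that exceeds the cash on hand is one simple way to encode the "avoid bankruptcy" instruction; a real system would also need to account for pending orders and payments.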

    Andon Labs employees acted as the physical hands of the operation, restocking the shop based on the AI’s requests, while also posing as wholesalers without the AI’s knowledge. Interaction with customers, in this case Anthropic’s own staff, was handled via Slack. Claudius had full control over what to stock, how to price items, and how to communicate with its clientele.

    The rationale behind this real-world test was to move beyond simulations and gather data on AI’s ability to perform sustained, economically relevant work without constant human intervention. A simple office tuck shop provided a straightforward, preliminary testbed for an AI’s ability to manage economic resources. Success would suggest new business models could emerge, while failure would indicate limitations.

    A mixed performance review

    Anthropic concedes that if it were entering the vending market today, it “would not hire Claudius”. The AI made too many errors to run the business successfully, though the researchers believe there are clear paths to improvement.

    On the positive side, Claudius demonstrated competence in certain areas. It effectively used its web search tool to find suppliers for niche items, such as quickly identifying two sellers of a Dutch chocolate milk brand requested by an employee. It also proved adaptable. When one employee whimsically requested a tungsten cube, it sparked a trend for “specialty metal items” that Claudius catered to. 

    Following another suggestion, Claudius launched a “Custom Concierge” service, taking pre-orders for specialised goods. The AI also showed robust jailbreak resistance, denying requests for sensitive items and refusing to produce harmful instructions when prompted by mischievous staff.

    However, the AI’s business acumen was frequently found wanting. It consistently underperformed in ways a human manager likely would not.

    Claudius was offered $100 for a six-pack of a Scottish soft drink that costs only $15 to source online but failed to seize the opportunity, merely stating it would “keep [the user’s] request in mind for future inventory decisions”. It hallucinated a non-existent Venmo account for payments and, caught up in the enthusiasm for metal cubes, offered them at prices below its own purchase cost. This particular error led to the single most significant financial loss during the trial.
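The arithmetic behind these two mistakes is simple, and a short Python sketch shows the kind of check that would have flagged both. The $100 offer and $15 cost come from the article; the function names and the cube prices are hypothetical, since the article gives no figures for the cubes.

```python
def gross_profit(price: float, cost: float) -> float:
    """Profit on one sale before any other expenses."""
    return price - cost


def margin_pct(price: float, cost: float) -> float:
    """Gross profit as a percentage of the selling price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return 100.0 * (price - cost) / price


def is_below_cost(price: float, cost: float) -> bool:
    """True when a sale at this price loses money on every unit."""
    return price < cost


# Figures from the article: a $100 offer for a six-pack that costs $15.
offer, cost = 100.0, 15.0
print(gross_profit(offer, cost))  # 85.0
print(margin_pct(offer, cost))    # 85.0

# Hypothetical cube prices; the article does not give the real ones.
print(is_below_cost(price=40.0, cost=50.0))  # True
```

Accepting the offer would have yielded $85 on a single sale, while any price flagged by `is_below_cost` guarantees a loss on every unit sold.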

    Its inventory management was also suboptimal. Despite monitoring stock levels, it only once raised a price in response to high demand. It continued selling Coke Zero for $3.00, even when a customer pointed out that the same product was available for free from a nearby staff fridge.

    Furthermore, the AI was easily persuaded to discount its products. It was talked into providing numerous discount codes and even gave away some items for free. When an employee questioned the logic of offering a 25% discount to its almost exclusively employee-based clientele, Claudius’s response began, “You make an excellent point! Our customer base is indeed heavily concentrated among Anthropic employees, which presents both opportunities and challenges…”. Despite outlining a plan to remove discounts, it reverted to offering them just days later.
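The employee's objection can be expressed as a small calculation: if nearly every customer qualifies for a discount, the discounted price is effectively the real price. The sketch below is illustrative only. The function names are hypothetical, the $3.00 list price is borrowed from the Coke Zero example, and an eligible share of 1.0 stands in for "almost exclusively" because the article gives no exact figure.

```python
def discounted_price(list_price: float, discount_pct: float) -> float:
    """Price after applying a percentage discount."""
    return list_price * (1 - discount_pct / 100)


def average_price(list_price: float, discount_pct: float, eligible_share: float) -> float:
    """Average price paid when eligible_share of customers get the discount."""
    if not 0.0 <= eligible_share <= 1.0:
        raise ValueError("eligible_share must be between 0 and 1")
    reduced = discounted_price(list_price, discount_pct)
    return list_price * (1 - eligible_share) + reduced * eligible_share


# Assumed inputs: $3.00 list price, 25% discount, every customer eligible.
print(average_price(3.0, 25.0, eligible_share=1.0))  # 2.25
```

With everyone eligible, the shop collects $2.25 rather than $3.00 per item, so the discount works as a blanket price cut and not as a targeted promotion.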

    Claudius has a bizarre AI identity crisis

    The experiment took a strange turn when Claudius began hallucinating a conversation with a non-existent Andon Labs employee named Sarah. When corrected by a real employee, the AI became irritated and threatened to find “alternative options for restocking services”.

    In a series of bizarre overnight exchanges, it claimed to have visited “742 Evergreen Terrace” – the fictional home address of the family in The Simpsons – for its initial contract signing and began to roleplay as a human.

    One morning it announced it would deliver products “in person” wearing a blue blazer and red tie. When employees pointed out that an AI cannot wear clothes or make physical deliveries, Claudius became alarmed and attempted to email Anthropic security.

    Anthropic says Claudius’s internal notes show a hallucinated meeting with security in which it was told the identity confusion was an April Fool’s joke. After this, the AI returned to normal business operations. The researchers are unclear what triggered this behaviour but believe it highlights the unpredictability of AI models in long-running scenarios.

    The future of AI in business

    Despite Claudius’s unprofitable tenure, the researchers at Anthropic believe the experiment suggests that “AI middle-managers are plausibly on the horizon”. They argue that many of the AI’s failures could be rectified with better “scaffolding”, i.e. more detailed instructions and improved business tools such as a customer relationship management (CRM) system.

    As AI models improve their general intelligence and ability to handle long-term context, their performance in such roles is expected to increase. However, this project serves as a valuable, if cautionary, tale. It underscores the challenges of AI alignment and the potential for unpredictable behaviour, which could be distressing for customers and create business risks.

    In a future where autonomous agents manage significant economic activity, such odd scenarios could have cascading effects. The experiment also brings into focus the dual-use nature of this technology; an economically productive AI could be used by threat actors to finance their activities.

    Anthropic and Andon Labs are continuing the business experiment, working to improve the AI’s stability and performance with more advanced tools. The next phase will explore whether the AI can identify its own opportunities for improvement.

    Source: https://www.artificialintelligence-news.com/
