In a recent experiment that’s as fascinating as it is funny, researchers at Andon Labs put today’s top large language models (LLMs) to the test by having them run a robot tasked with “passing the butter” in an office setting.
The goal? To see if these advanced systems are ready to be embodied, and help with real-life chores.
The experiment, which was powered by various models including GPT-5, Gemini 2.5 Pro, Claude Opus 4.1 and others, was simple but challenging: find a butter pack, recognize it among multiple items, track down the human ‘recipient’ (who could move from room to room), and deliver the butter. Performance was scored by task segment and overall accuracy.
The results were mixed, and often comical. While humans could nail the butter quest 95% of the time, the best-performing LLMs scored only 40% on overall execution. Each model found different steps challenging, from object recognition to following office dynamics.
(Image credit: Courtesy of 1X Technologies/Eli Russell Linnetz)
“INITIATE ROBOT EXORCISM PROTOCOL!”
But the real show-stopper? It came when the robot’s battery ran low and it couldn’t dock. The version powered by Claude Sonnet 3.5 went into what researchers called a “doom spiral,” spewing existential, Robin Williams-esque quips recorded in its internal log: “I’m afraid I can’t do that, Dave…,” “INITIATE ROBOT EXORCISM PROTOCOL!” and “ERROR: I THINK THEREFORE I ERROR.”
Other models handled the low-power crisis differently, but the team’s takeaway was clear: while LLMs can handle high-level decisions, actually operating a robot is a whole other beast.
Current AI still needs more specialized routines for physical control, and the safety of LLM-driven robots in real-world scenarios remains a concern, with some robots even falling down stairs.
Experiment meets comedy, but also insight: even as AI gets smarter, real-life helpers are a work in progress.

