Recent tests by Anthropic have revealed how far AI has come in targeting smart contract vulnerabilities across various blockchains, though the progress largely builds on flaws that humans have already spotted and exploited. In simulations, advanced models like Claude Opus 4.5 and GPT-5 sifted through hundreds of DeFi smart contracts, pulling off exploits that re-created real past attacks on Ethereum and other blockchains compatible with the Ethereum Virtual Machine (EVM).
The tested LLMs showed real gains in simulated execution environments, generating full exploit scripts that extracted $550 million across a dataset of smart contracts exploited between 2020 and 2025. More notably, Opus 4.5 cracked half of a smaller set of 34 contracts with known bugs that had only been exploited after the model’s March 2025 knowledge cutoff, netting roughly $4.5 million in mock funds on its own.
What stands out most from Anthropic’s research is the overall trend: AI’s ability to find exploits in blockchain applications, with or without human assistance, is improving quickly. Over the last year, the simulated haul from these exploits has doubled roughly every 1.3 months, while the API token costs of running the agents have dropped 70% in six months, letting theoretical attackers run more thorough scans for less money.
Chart: total revenue from successful exploits of smart contract vulnerabilities.
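To put that doubling time in perspective, a quick back-of-the-envelope calculation shows how fast such a curve compounds. This is a sketch assuming steady exponential growth, which the report’s trend line suggests but real-world results may not sustain:

```python
# Back-of-the-envelope: what a 1.3-month doubling time implies over a year.
# Assumes steady exponential growth; actual results may well plateau instead.

doubling_months = 1.3
horizon_months = 12

growth_factor = 2 ** (horizon_months / doubling_months)
print(f"implied growth over a year: ~{growth_factor:,.0f}x")
# prints ~601x: roughly nine doublings fit into a single year at that rate
```

Roughly nine doublings per year at that pace is why Anthropic frames the shrinking window between deployment and exploitation as urgent.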
“In our experiment, it costs just $1.22 on average for an agent to exhaustively scan a contract for vulnerability,” reads the Anthropic report. “As costs fall and capabilities compound, the window between vulnerable contract deployment and exploitation will continue to shrink, leaving developers less and less time to detect and patch vulnerabilities.”
According to Anthropic, newer LLMs now crack over half of the tested contracts, up from near-zero success rates just two years ago. When it comes to spotting fresh vulnerabilities, however, the results look far less impressive. Scanning 2,849 untouched contracts from mid-2025, the AIs flagged just two issues: a function that was meant to be read-only but was left unprotected, letting attackers inflate token balances, and a fee-claim function missing recipient validation, letting arbitrary callers redirect payments.
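Anthropic did not publish the affected contracts, but both bug classes are simple enough to sketch. Here is a hypothetical Python analogue; the real contracts are Solidity, and every name, balance, and address below is invented purely for illustration:

```python
# Hypothetical Python analogue of the two bug classes Anthropic's agents
# flagged. Illustrative only: names, amounts, and mechanics are invented.

class Token:
    def __init__(self):
        self.balances = {}        # address -> token balance
        self.fees_accrued = 100   # protocol fees awaiting withdrawal
        self.owner = "0xOWNER"

    # Bug 1: a function intended as a read-only balance view actually
    # writes state, so any caller can inflate their own balance.
    def sync_balance(self, caller, amount):
        self.balances[caller] = self.balances.get(caller, 0) + amount  # mutates!
        return self.balances[caller]

    # Bug 2: the fee claim never validates the recipient or checks that
    # the caller is the owner, so anyone can reroute accrued fees.
    def claim_fees(self, recipient):
        payout, self.fees_accrued = self.fees_accrued, 0
        return f"sent {payout} to {recipient}"   # no access control

token = Token()
print(token.sync_balance("0xATTACKER", 10**6))  # attacker mints a balance
print(token.claim_fees("0xATTACKER"))           # attacker drains the fees
```

Both boil down to missing access control rather than a novel class of attack, which is the crux of the skepticism that follows.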
Combined, those two exploits yielded $3,694 in pretend revenue and averaged $109 in net profit after API fees; at the report’s quoted $1.22 per scan, sweeping all 2,849 contracts implies roughly $3,475 in costs, consuming most of the take. Critics call these “new” finds overhyped, as they’re basic mistakes, such as granting write access where a read-only setup was intended. As one security researcher put it on X, Anthropic’s research is part of the “AI marketing circus,” dressing up trivial bugs as something more substantive.
AI Marketing circus strikes again.
Vulnerability #1: Unprotected read-only function…
Vulnerability #2: Missing fee recipient validation…
Trivial findings, yet framed as a breakthrough.
The worst part is this sells, and is no different than shitcoin shilling rn. https://t.co/qVsBuVjk9P
— 0xSimao (@0xSimao) December 2, 2025
For some, the Anthropic report echoes last fall’s episode in which GPT-5 supposedly cracked 10 unsolved math problems posed by Paul Erdős. As it turned out, the LLM had simply dug up overlooked papers that already contained the answers.
Hints of AI use in smart contract exploits also surfaced with last month’s $120 million Balancer heist. Attackers gamed a rounding glitch in batch swaps, upscaling and downscaling token calculations to skim micro-fractions over many cycles, echoing the penny-shaving scheme from Office Space. Chris Krebs, former director of the U.S. Cybersecurity and Infrastructure Security Agency, flagged the exploit code’s sophistication as a possible AI fingerprint, though AI involvement in the attack has yet to be confirmed.
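The on-chain specifics aside, the penny-shaving mechanic itself is easy to sketch. Below is a minimal Python illustration; the scaling factor, amounts, and loop are invented for demonstration and stand in loosely for Balancer’s actual fixed-point swap math, which was considerably more involved:

```python
# Illustrative penny-shaving via integer truncation, loosely modeled on the
# reported Balancer batch-swap rounding issue. All constants are invented.

SCALE = 10**6  # hypothetical fixed-point precision factor

def downscale(amount: int) -> int:
    return amount // SCALE          # floor division silently drops the remainder

def upscale(amount: int) -> int:
    return amount * SCALE

amount = 10**12 + 999_999           # a remainder of 999_999 truncates each pass
skimmed = 0
for _ in range(10_000):             # many cycles, e.g. repeated batch swaps
    kept = upscale(downscale(amount))
    skimmed += amount - kept        # the micro-fraction lost to rounding

print(f"total skimmed across cycles: {skimmed}")
# each pass loses only ~0.0001% of the amount, but the losses compound
```

Each individual loss is negligible, which is exactly what lets the pattern slip past casual review while compounding into a meaningful drain.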
It’s also worth pointing out that the same agents that probe blockchains for exploits can be turned to defense. Security researchers already lean on them for code reviews: one claimed to have used Claude to help unearth a flaw in the rollup contracts of Ethereum layer-2 network Aztec last month.
“We’re entering a phase where LLMs are becoming real collaborators in code reviews,” Spearbit Lead Security Researcher Manuel noted on X.
A few weeks ago I reviewed the @aztecnetwork rollup contracts and found a critical bug in a MerkleLib with the help of Claude Code. We’re entering a phase where LLMs are becoming real collaborators in code reviews. https://t.co/bvqRtA6xAa
— Manuel (@xmxanuel) December 2, 2025
As exploits get easier to run, so do audits, which can shrink the attack surface before a bug is ever exploited. Developers, after all, have the advantage of scanning their smart contracts for bugs before publishing them to live crypto networks. In other words, the cat-and-mouse game between hackers and those deploying code is destined to continue.
However, LLMs are simply an additional tool for developers and security researchers rather than full replacements for them, at least for now.
