A specialized AI security agent built by Cecuro detected vulnerabilities in 92 percent of real-world exploited DeFi smart contracts, covering $96.8 million in verified exploit value. A baseline agent running on GPT-5.1 with generic prompting caught just 34 percent, covering only $7.5 million. The study, published on February 20, 2026, evaluated 90 contracts that collectively lost $228 million to real attacks, and the findings reframe how the crypto industry should think about automated security.
The 92 vs 34 Percent Gap Is Not About Model Quality
The most counterintuitive finding in Cecuro's benchmark is that both systems ran on the same frontier AI model. The specialized agent did not use a better or larger language model; it used the same core engine as the GPT-5.1 baseline. Cecuro attributes the gap to "domain-specific security methodology layered on top of the model, not differences in core AI capability."
That methodology includes structured review phases that mirror how experienced human auditors approach contract analysis, DeFi-focused security heuristics tuned to the specific patterns that attackers exploit, and domain-specific analysis protocols that guide the model through the logic of flash loans, reentrancy, oracle manipulation, and access control failures.
In practical terms, the generic model looked at a contract and saw code. The specialized model looked at the same contract and saw attack surfaces.
$228 Million in Verified Losses, 90 Contracts, One Dataset
The benchmark dataset is not hypothetical. Every contract in the evaluation was exploited in a real attack with verified on-chain losses. The 90 contracts span the major categories of DeFi exploits: flash loan attacks, price oracle manipulation, reentrancy bugs, access control failures, and logic errors.
Several of the contracts in the dataset had undergone professional security audits before being exploited. This is a critical detail. It means the benchmark is not limited to obvious bugs that any tool would find. It includes vulnerabilities that paid human auditors missed, in code that was formally reviewed and shipped to production with a green light.
The $96.8 million in exploit value detected by Cecuro's agent represents real money that might have been saved had the tool been applied, and its findings acted on, before the attacks. The $7.5 million caught by the baseline shows what a frontier model with generic prompting delivered in this benchmark without domain specialization.
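The headline figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the contract counts are inferred by rounding the stated percentages and are an assumption, not a figure from the study.

```python
# Figures quoted in the article.
TOTAL_CONTRACTS = 90
TOTAL_LOSSES_M = 228.0  # USD millions lost across the dataset

agents = {
    "specialized": {"detection_rate": 0.92, "value_detected_m": 96.8},
    "baseline": {"detection_rate": 0.34, "value_detected_m": 7.5},
}


def summarize(rate: float, value_m: float) -> dict:
    """Convert a reported rate and dollar figure into a count and a value share."""
    return {
        # Inferred count (assumption): the article gives a percentage, not a count.
        "approx_contracts": round(rate * TOTAL_CONTRACTS),
        "value_share": value_m / TOTAL_LOSSES_M,
    }


summary = {
    name: summarize(a["detection_rate"], a["value_detected_m"])
    for name, a in agents.items()
}

for name, s in summary.items():
    print(
        f"{name}: ~{s['approx_contracts']} of {TOTAL_CONTRACTS} contracts, "
        f"{s['value_share']:.1%} of total losses by value"
    )
```

If both dollar figures are measured the same way, the specialized agent's contract-level rate (92 percent) sits well above its share of total losses by value (a little over 42 percent of $228 million), which would suggest that the contracts it missed accounted for a large share of the money lost.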
The Offense Is Cheaper Than You Think
Cecuro's benchmark arrives alongside separate research from Anthropic and OpenAI showing that the offensive side of the AI security equation is advancing even faster. AI agents can now execute end-to-end exploits on most known vulnerable smart contracts, with the capability reportedly doubling roughly every 1.3 months.
The average cost to exploit a vulnerable contract with an AI agent is approximately $1.22. That is not a typo. For the price of a gas station coffee, an automated system can identify a vulnerability, construct the exploit payload, and drain the contract.
This creates a deeply asymmetric threat landscape. Attackers need to succeed once. Defenders need to succeed every time. And when the cost of an attack drops below $2, the volume of probing and exploitation attempts scales to a level that no human audit team can match through manual review alone.
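To put the cost figure in perspective, the sketch below converts a budget into a number of attempts at the cited average of $1.22 per contract. The budgets are illustrative assumptions, and treating the average as a flat per-attempt cost is a simplification.

```python
AVG_COST_PER_CONTRACT_USD = 1.22  # average cited in the article


def attempts_for_budget(
    budget_usd: float, cost_usd: float = AVG_COST_PER_CONTRACT_USD
) -> int:
    """Whole number of contract-level attempts a budget covers at a flat cost."""
    if cost_usd <= 0:
        raise ValueError("cost_usd must be positive")
    return int(budget_usd // cost_usd)


for budget in (100, 1_000, 10_000):  # hypothetical budgets
    print(f"${budget:,} covers about {attempts_for_budget(budget):,} attempts")
```

The point of the exercise is scale: at this price, the limiting factor is the supply of vulnerable contracts, not the budget.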
The OpenAI and Paradigm EVMbench study published earlier showed AI exploiting 70 percent of critical smart contract bugs. Cecuro's work is the defensive mirror: specialized AI flagging 92 percent of the exploited contracts in its own dataset.
What DeFi Users and Wallet Holders Should Watch
For anyone holding assets in DeFi protocols, lending pools, or self-custody wallets that interact with smart contracts, the practical takeaway is layered.
First, the audit report on a protocol you trust may not be worth what you think. If professionally audited contracts still got exploited, the audit is a risk-reduction measure, not a guarantee. The same holds for automated review: at 92 percent detection, even this specialized agent missed 8 percent of the exploited contracts in the benchmark.
Second, AI-powered attack capability is advancing faster than defenses. A 1.3-month doubling time means the threat profile of DeFi smart contracts changes materially every quarter. Protocols that were secure six months ago may have new attack surfaces that only emerged as AI capabilities improved.
Third, this research supports the security model behind self-custody crypto cards that minimize smart contract exposure. Cards like Gnosis Pay and MetaMask Card that connect directly to user-controlled wallets reduce the number of intermediate contracts that could be exploited. Fewer contract interactions mean fewer attack surfaces.
The Open-Source Play and Its Limits
Cecuro open-sourced the benchmark dataset, the evaluation framework, and the baseline agent on GitHub. This means any security team, protocol developer, or researcher can test their own tools against the same 90 exploited contracts and compare results.
What Cecuro did not open-source is its full specialized security agent, citing misuse concerns. This is the standard tension in security research: publishing the methodology helps defenders, but publishing the complete tool could also help attackers build better exploit agents. Cecuro chose the middle path, releasing enough for the industry to benchmark but not enough to weaponize.
The open dataset has immediate value for DeFi protocols evaluating security tooling. Instead of trusting vendor claims, teams can run competing tools against verified exploits and measure detection rates empirically. This is the kind of transparency the industry has lacked, where audit firms compete on reputation and relationships rather than verifiable performance data.
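A comparison of this kind reduces to two numbers per tool: the share of exploited contracts it flags, and the share of lost value those contracts represent. The sketch below is a minimal illustration under stated assumptions; the `ContractResult` record, its field names, and the sample values are hypothetical and do not reflect the format of Cecuro's dataset or evaluation framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractResult:
    """One benchmark entry (hypothetical schema, not Cecuro's)."""

    contract_id: str
    loss_usd: float  # verified loss from the real exploit
    detected: bool   # did the tool flag the exploited vulnerability?


def detection_metrics(results: list[ContractResult]) -> dict[str, float]:
    """Return contract-level and value-weighted detection rates."""
    if not results:
        raise ValueError("results must not be empty")
    total_loss = sum(r.loss_usd for r in results)
    hits = [r for r in results if r.detected]
    return {
        "contract_rate": len(hits) / len(results),
        "value_rate": sum(r.loss_usd for r in hits) / total_loss if total_loss else 0.0,
    }


# Made-up sample data for illustration only.
sample = [
    ContractResult("a", 1_000_000, True),
    ContractResult("b", 250_000, True),
    ContractResult("c", 4_000_000, False),
    ContractResult("d", 750_000, True),
]
metrics = detection_metrics(sample)
print(
    f"contract-level: {metrics['contract_rate']:.0%}, "
    f"value-weighted: {metrics['value_rate']:.0%}"
)
```

Reporting both rates matters because a tool can flag most contracts while still missing the few that account for the largest losses, as the made-up sample shows.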
For crypto card users, the broader implication is that the protocols underlying your spending are entering an AI-driven security era. The cards that route through the fewest smart contracts, or that use audited and battle-tested contracts, carry the lowest exposure. Stablecoin-based cards that hold USDC or USDT without complex DeFi interactions avoid most of these risks.
The Arms Race Has a Clock
The 1.3-month doubling time for AI exploit capabilities sets a timeline that the entire DeFi ecosystem needs to internalize. A full year contains a little over nine doublings at that pace, so if the trend held through 2026, offensive AI would end the year several hundred times more capable than it started. If defensive tooling like Cecuro's does not scale at the same rate or faster, the gap between what attackers can do and what defenders can catch will widen.
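The compounding implied by a fixed doubling time can be checked directly: growth over t months is 2 raised to the power of t divided by the doubling time. The sketch below assumes the 1.3-month figure stays constant over the whole period, which is an extrapolation and not something the cited research guarantees.

```python
DOUBLING_TIME_MONTHS = 1.3  # figure cited in the article


def growth_factor(months: float, doubling_time: float = DOUBLING_TIME_MONTHS) -> float:
    """Multiplicative growth after `months`, given a constant doubling time."""
    if doubling_time <= 0:
        raise ValueError("doubling_time must be positive")
    return 2 ** (months / doubling_time)


for months in (3, 6, 12):
    print(f"{months:>2} months: about {growth_factor(months):,.0f}x")
```

Under that assumption a single quarter already corresponds to roughly a fivefold increase, which is why short review cycles matter more than any one annual projection.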
The 92 percent detection rate is impressive today. Whether it holds against next-generation attack vectors depends on whether specialized security AI can keep pace with specialized exploit AI. Cecuro's benchmark gives the industry a measuring stick. The next question is whether anyone is running fast enough.
FAQ
How did the specialized AI achieve 92 percent detection? Cecuro layered DeFi-specific security heuristics, structured review phases, and domain-specific analysis protocols on top of the same frontier AI model used by the baseline. The improvement came from methodology, not model size.
Does this mean DeFi is safe now? No. The 92 percent detection rate still means 8 percent of real exploits were missed. And the benchmark tests known historical exploits, not novel zero-day attacks. DeFi security is improving but remains a cat-and-mouse game.
How much does it cost to exploit a smart contract with AI? According to separate research from Anthropic and OpenAI, the average cost is approximately $1.22 per contract. This extremely low cost means exploit attempts will only increase in volume.
Is the benchmark dataset available? Yes. Cecuro open-sourced the dataset of 90 exploited contracts, the evaluation framework, and the baseline agent on GitHub. The full specialized security agent was withheld due to misuse concerns.
What should DeFi users do to protect themselves? Diversify across protocols, favor battle-tested contracts, limit exposure to unaudited or newly launched protocols, and consider spending options that minimize smart contract interactions.
Overview
Cecuro's AI security benchmark reveals a stark divide in DeFi defense: a specialized agent detected 92 percent of real-world exploits across 90 contracts that lost a combined $228 million, while a generic GPT-5.1 baseline caught only 34 percent. The gap came from methodology, not model quality. With AI exploit capabilities doubling every 1.3 months at a cost of $1.22 per attack, the race between offense and defense is accelerating. Cecuro open-sourced the benchmark dataset for the industry to test against, but withheld its full agent. For DeFi users and crypto card holders, the message is clear: smart contract exposure is the new frontier of risk, and the tools to measure it finally exist.
Recommended Reading
- OpenAI and Paradigm Launch EVMbench as AI Exploits 70 Percent of Critical Smart Contract Bugs
- A Single Misconfigured Oracle Valued cbETH at $1.12 Instead of $2,200, Draining $1.78 Million From Moonwell
- Phantom Launches an MCP Server That Lets AI Agents Sign, Swap, and Transfer Tokens Across Four Blockchains








