Winning a 2025 security contest in <20 minutes

0xfirefist · June 4, 2026 · 3 min read

Solidity Auditor V3

How we got here

So far we have launched v1 and v2 of the solidity-auditor security Skill. Thousands of developers and security researchers have already used it, finding a ton of Critical and High severity issues, and many people use it for bug bounties as well. Still, we wanted to keep improving it.

After more than 150 runs and many different approaches (new agents, more iterations, extra layers, up-front scoping), we have our results ready.

Case Study: DODO Contest

DODO Cross-Chain DEX was a security contest that ran for 8 days, with ~1600 lines of code and ~100 security researcher participants. Their efforts discovered 5 High and 12 Medium severity vulnerabilities.

Solidity-auditor v3, our latest AI Security Skill, found 14 of the 17 findings (82.4% recall) in less than 20 minutes. For comparison, the best-performing security researcher found 8 vulnerabilities in total.
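The recall figure above is simple arithmetic, and a minimal sketch makes it easy to re-check. The function name `recall_pct` is ours for illustration. The second calculation assumes that all 8 of the best researcher's vulnerabilities count among the 17 contest findings, which the contest summary above does not state explicitly.

```python
def recall_pct(found: int, total: int) -> float:
    """Percentage of the known findings that were rediscovered."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * found / total


# Contest ground truth from the case study: 5 High + 12 Medium findings.
contest_total = 5 + 12

# solidity-auditor v3 matched 14 of them.
v3_recall = recall_pct(14, contest_total)

# Assumption: the best researcher's 8 vulnerabilities are all among the 17.
best_human_recall = recall_pct(8, contest_total)

print(f"v3: {v3_recall:.1f}%")            # v3: 82.4%
print(f"best researcher: {best_human_recall:.1f}%")  # best researcher: 47.1%
```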

The two approaches that increased recall %

1. A shared reasoning discipline

Every agent now follows the same 3 senior-auditor practices:

Feynman: if you can’t explain a function in plain words, you don’t understand it yet. The jargon is where the bug hides.
Socratic: drill past the first answer to the assumption underneath it.
Inversion: the developer asks “does this work?” We ask “how do I break this?”

2. New gap hunter agents

Most specialists work one lens: arithmetic, access, economics, or external calls. But the hairiest bugs live in the seams between two or three lenses, where each individual specialist would wrongly conclude “nothing is wrong here”.

Three new agents hunt only the seams:

Flow-gap: a callback hands control away mid-execution, and the code after it trusts state from before the call. Each step is correct, but the sequence is wrong.
Numerical-gap: an invariant that holds in real-number math but breaks under integer rounding. For example, a fee that truncates to zero.
Trust-gap: deposits are priced on spot, while withdrawals are priced on TWAP. Individually they are reasonable; together they lead to a free trade.
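To make the numerical-gap idea concrete, here is a small Python sketch of a fee that truncates to zero. It is an illustration only, not code from the contest or from solidity-auditor: the 30 basis point fee, the amounts, and the function names are all hypothetical. Python's `//` on non-negative integers rounds down the same way Solidity's unsigned integer division does.

```python
from fractions import Fraction

FEE_BPS = 30              # hypothetical fee: 30 basis points (0.30%)
BPS_DENOMINATOR = 10_000


def fee_integer(amount: int) -> int:
    """Fee as integer math computes it: the remainder is discarded."""
    return amount * FEE_BPS // BPS_DENOMINATOR


def fee_exact(amount: int) -> Fraction:
    """Fee as real-number math would compute it, with no rounding."""
    return Fraction(amount * FEE_BPS, BPS_DENOMINATOR)


small = 333               # 333 * 30 = 9990, which is below 10_000
big = 333_000

# In exact math the small transfer owes a positive fee...
exact_small_fee = fee_exact(small)      # Fraction(999, 1000)
# ...but integer division truncates it to zero.
truncated_small_fee = fee_integer(small)  # 0

# One large transfer pays the fee; the same value split into
# 1000 small transfers pays nothing at all.
fee_single = fee_integer(big)
fee_split = sum(fee_integer(small) for _ in range(big // small))

print(exact_small_fee, truncated_small_fee)  # 999/1000 0
print(fee_single, fee_split)                 # 999 0
```

The invariant "every transfer pays a fee proportional to its size" holds for `fee_exact` but not for `fee_integer`, which is the kind of mismatch the Numerical-gap lens is described as hunting.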

Comparison vs other open source security tools

We ran solidity-auditor v3 against the most popular & top performing open source tools out there.

Here is how we compare on recall:

Recall

Across all 4 codebases, solidity-auditor v3 has the highest recall, tying with Nemesis on Ammplify.

When it comes to token spend and time, here are the results:

Run time

Across all 4 codebases, solidity-auditor v3 stays in the ~19–28 min range and is, on average, several times faster than most tools.

Token consumption

On token spend, solidity-auditor v3 costs more than only two of the tools compared: solidity-auditor v2 and Claude Code itself.

Installation & usage of solidity-auditor v3

Follow the instructions in the README here:

Repository

github.com/pashov/skills

solidity-auditor is an AI assistant. AI analysis can never verify the complete absence of vulnerabilities, and no guarantee of security is given.

For security consulting with elite experts, visit pashov.com, or reach out directly via Telegram.