Daniel Stenberg, the author of curl, a utility for transferring data over the network, has summed up his experience using the Claude Mythos AI model to analyze the Curl codebase for vulnerabilities. According to Daniel, the breakthrough vulnerability-hunting capabilities that Anthropic announced for Claude Mythos, and that led the company to restrict access to the model, are largely a marketing exaggeration: when checking the Curl code, the model showed no significant advantage over AI products from other vendors.
At the same time, Daniel acknowledged a significant improvement in the quality of modern AI code analyzers, which outperform traditional static analyzers and can identify mismatches between code and the behavior described in its comments, investigate problems in third-party dependencies, take protocol specifics into account, and suggest fixes.
Analyzing 176,000 lines of Curl code with Claude Mythos turned up 5 issues marked in the report as "confirmed vulnerabilities". Manual review showed that only 1 of the 5 was an actual vulnerability: three were false positives and one was a bug with no security impact. The confirmed vulnerability is not memory-related, is rated low severity, and will be fixed in the Curl 8.21.0 release at the end of June.
Before this, over the past 8-10 months, the Curl code had been checked with the AI services AISLE, Zeropath and OpenAI Codex Security, leading to 200-300 bug fixes, 12 of them vulnerabilities. In addition, independent enthusiasts ran their own AI checks and submitted reports on vulnerabilities identified with AI assistance. In total, about 60 vulnerabilities have been identified in Curl this year. After all these checks, the Mythos model found one new minor vulnerability and about twenty minor bugs. The model described the errors well and produced virtually no false positives.
Meanwhile, OpenAI presented Daybreak, a toolkit based on the GPT-5.5 AI model and the Codex AI agent, designed for finding vulnerabilities, analyzing malicious code, and developing fixes. As an option, Daybreak offers the GPT-5.5-Cyber AI model, which lifts some restrictions on creating exploits and testing system security. Access to GPT-5.5-Cyber is limited to security researchers whose applications pass a selective approval process.