AI Vulnerability Detection and Zero-Day Discovery

AI vulnerability detection moves from experiment to frontline defense

AI vulnerability detection is the use of advanced machine learning models and autonomous agents to scan code, identify exploitable flaws, and suggest or generate fixes far faster and at far greater scale than traditional manual security reviews. Anthropic’s Project Glasswing shows how quickly this is becoming operational. The company has expanded access to Claude Mythos Preview from an initial 50 organizations to around 200 partners in total, spanning roughly 15 countries and industries from cloud providers to utilities and healthcare. These partners point Mythos-class models at massive, production codebases and report thousands of high- or critical-severity issues uncovered. Anthropic says Mythos has already found “thousands of high-severity vulnerabilities” in major operating systems and browsers, and warns that AI models “can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.”

How AI Is Finding Decades-Old Software Vulnerabilities Faster Than Security Teams

Claude Mythos security and the scale of Project Glasswing

Project Glasswing is Anthropic’s effort to democratize Claude Mythos security capabilities while keeping them in trusted hands. With approximately 150 new organizations joining the original cohort, more than 200 partners now receive controlled access to Mythos Preview under strict security requirements. These partners include vendors and maintainers whose code underpins infrastructure for power, water, healthcare, communications, and hardware, as well as large technology firms. In early runs, participants used Mythos-class models on their own repositories and surfaced over 10,000 high- or critical-severity vulnerabilities, forcing a shift from “finding bugs” to verifying, disclosing, and patching them. Anthropic has also launched Claude Security, a service based on its public Opus models that scans codebases and suggests patches, and is sharing internal tools with trusted teams so they can perform bug detection at scale, not just with Mythos but across broader defensive workflows.

AI agents and the zero-day discovery in FFmpeg

The FFmpeg case shows how autonomous agents can expose long-hidden flaws. Security startup depthfirst ran an AI-based agent across roughly 1.5 million lines of C in the FFmpeg media library and uncovered 21 previously unknown zero-day vulnerabilities, each with a reproducible proof-of-concept. The company estimates the scan cost about USD 1,000 (approx. RM4,600), putting industrial-scale zero-day discovery within reach of many defenders—and attackers. Several bugs had been dormant for 15 to 20 years; one stack overflow in service-description-table code dated to 2003 and had gone unnoticed for 23 years. Most issues were heap or stack overflows in parsers and demuxers, from the TS demuxer to the VP9 decoder, with multiple CVE identifiers already assigned. This kind of AI-powered zero-day discovery underscores how legacy codebases, once thought stable, can be systematically mined for exploitable weaknesses.

Chrome’s 429 fixes and the volume problem in software security automation

While the FFmpeg zero-days came from an AI agent, Chrome 149 shows what happens when AI-accelerated reporting hits mature security pipelines. Google’s latest release patched 429 vulnerabilities, the most in a single Chrome version, including over 100 rated critical or high severity. The worst, CVE-2026-10881 with a CVSS score of 9.6, is an out-of-bounds read and write in the ANGLE graphics engine that can let a crafted page escape the sandbox and execute code on the host. According to Google, 19 of the 22 critical bugs and most high-severity issues were discovered internally, but the company had to overhaul its bounty program to cope with a flood of AI-generated submissions, now prioritizing concise reproducers over long AI-written reports. The lesson is clear: software security automation increases bug volume so quickly that the main bottleneck becomes triage and patch deployment, not vulnerability discovery.

From reactive patching to proactive AI-driven security at scale

Taken together, Claude Mythos security initiatives and FFmpeg’s AI-found zero-days mark a turning point: enterprises are moving from reactive patching to proactive vulnerability discovery at scale. In Glasswing, partners are not waiting for incidents; they are systematically sweeping their codebases with Mythos-class models and sharing best practices for triage and disclosure. The FFmpeg and Chrome stories show that once AI raises the ceiling on bug detection, organizations must redesign processes, incentives, and infrastructure to keep up with the flood of findings. Anthropic openly states that its role is shifting “from finding vulnerabilities to disclosing, fixing, and deploying patched software,” highlighting that automation must extend beyond scanning into remediation. For defenders, the near future is one where AI vulnerability detection and software security automation run continuously, turning codebases into living systems that are inspected, prioritized, and repaired as fast as new bugs appear.