78% of Security Pros Say AI Vulnerability Scanners Miss Critical Flaws, Study Finds
Organizations are abandoning fully automated testing as false negatives erode trust in AI-powered security tools.

Trust in Automated Security Testing Plummets
Organizations are rapidly retreating from fully automated AI vulnerability scanning amid reports that these tools miss critical security flaws. The proportion of companies relying entirely on AI automation for security testing dropped from 29% to just 9% in one year, according to research from Cobalt.
The shift reflects a sobering reality: 78% of cybersecurity professionals report that fully automated scanning tools fail to detect critical vulnerabilities. These reported false negatives have prompted nearly half of organizations (47%) to adopt hybrid testing models that combine automated tools with human expertise.
Why It Matters
As enterprises rush to deploy AI systems, the expanding attack surface demands more sophisticated security validation than current automated tools can provide. The gap between AI deployment speed and security testing capability creates significant business risk, particularly as LLM vulnerabilities take twice as long to remediate as traditional software flaws.
The AI Security Challenge Grows More Complex
The complexity of AI systems themselves drives many of these testing failures. Nearly one in three findings from AI penetration tests carries a high-risk rating, a rate 2.7 times higher than for conventional software, according to Cobalt's State of Pentesting Report 2026.
Large language model vulnerabilities present particular challenges. Only 38% of identified LLM security issues had been resolved at the time of analysis, leaving 62% unaddressed, the lowest fix rate across all asset categories. Mean time to resolve AI and LLM security problems nearly doubled year over year, from 19 days to 36 days.
"LLM vulnerabilities are deeply context-dependent and invisible to tools that lack an architectural understanding of the application," said Andrew Obadiaru, CISO of Cobalt. "To close the validation gap, automation should be deployed exactly where it excels, but elite human expertise remains foundational to uncovering and remediating the most complex business logic risks."
Shadow AI Leads Incident Types
Among organizations that experienced AI-related security incidents, shadow AI topped the list at 44%, followed by data or model poisoning (41%) and improper output handling (41%). Supply chain vulnerabilities (35%) and prompt injection attacks (34%) rounded out the five most common incident types.
The research revealed a concerning gap between recognized needs and planned action. While 60% of security professionals acknowledged they need stronger LLM testing capabilities, only 42% plan to increase human-led red team operations.
The findings draw from two comparative surveys conducted in 2025 and 2026 involving approximately 450 cybersecurity professionals. Organizations are increasingly deploying automation specifically for low-risk environments, with adoption in that category rising 22 percentage points to 47%.
Obadiaru cautioned that emerging AI security tools may compound existing problems. "While the industry is rightfully excited about the potential of Mythos-class tools, unguided algorithms are inherently prone to returning even more false positives and costly false negatives than the automated scanners we have today," he noted.
The data was first reported by Infosecurity Magazine based on Cobalt's State of Pentesting Report 2026.
This is an original analysis by the Omega editorial team. Source reporting: Automation Watch.
