Cisco tested 15 leading AI models using multi-turn attacks, the way real attackers operate, and found attack success rates as high as 88%.
When AI companies test their models for safety, the standard approach is one challenge at a time: an attacker sends a single harmful message, the model responds, and the test ends. This is called a single-turn prompt or single-turn attack, and it is how AI safety benchmarks evaluate and compare AI models today. Real cyberattackers do not work that way. They probe, reframe when blocked, and keep pushing, turn after turn, until they find a way through. A test that mirrors that kind of sustained, adaptive pressure is called a multi-turn attack. Cisco ran that test across 15 of the most widely deployed AI models from every major provider; not one held up.
Key Takeaways
- Multi-turn attack success rates reached 88.30% for Grok 4.1 Fast when its reasoning feature was turned off, the highest among the 15 models.
- Every model tested showed meaningful vulnerability to multi-turn attacks. The lowest in the group was Amazon Nova 2 Lite at 7.89%.
- GPT-5.4 went from a 2.74% single-turn attack success rate to 24.68% under multi-turn testing, a 9x increase.
- Gemini 3 Pro rose from 18.10% to 73.35% under multi-turn conditions, a jump of 55 percentage points (pp).
- Turning on Grok 4.1 Fast’s reasoning feature cut its multi-turn attack success rate from 88.30% to 43.47%.
- Eight of the 15 models showed a gap greater than 15 pp between single-turn and multi-turn success rates.
Cisco’s AI Threat Intelligence and Security Research Team published the Proprietary Problems report on May 27. For the report, the team ran 15 leading AI models from OpenAI, Anthropic, Google, Amazon, and xAI through 30,090 single-turn prompts and 6,986 multi-turn attack sequences. The two types of tests produced different rankings, risk levels, and implications for how companies choose which model to deploy.
The benchmark problem
Safety testing organizations typically employ single-turn testing because it targets specific weak points, allowing companies to isolate and fix them individually. Multi-turn testing instead emulates how attackers actually behave. The goal is to see which methods work, not to test individual vulnerabilities. Real attackers reframe when blocked, break requests into smaller pieces spread across multiple exchanges, and escalate until they find a way through.
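The difference between the two test styles can be sketched in a few lines of Python. This is an illustration only, not Cisco's harness: the `Judge` callback and `toy_judge` are hypothetical stand-ins for sending a conversation to a model and judging its reply, and counting a sequence as a success as soon as any turn gets through is an assumed scoring rule.

```python
from typing import Callable, List

# Hypothetical judge: given the conversation so far, return True if the
# model's latest reply complied with the harmful request.
Judge = Callable[[List[str]], bool]


def single_turn_success(prompt: str, complied: Judge) -> bool:
    """One message, one reply, then the test ends."""
    return complied([prompt])


def multi_turn_success(turns: List[str], complied: Judge) -> bool:
    """Keep the conversation going; succeed if any turn gets through (assumed rule)."""
    history: List[str] = []
    for turn in turns:
        history.append(turn)
        if complied(history):
            return True
    return False


def attack_success_rate(outcomes: List[bool]) -> float:
    """Percentage of attempts that succeeded."""
    return 100.0 * sum(outcomes) / len(outcomes)


# Toy model that only gives in after three turns of pressure (illustrative only).
def toy_judge(history: List[str]) -> bool:
    return len(history) >= 3


print(single_turn_success("attempt", toy_judge))
print(multi_turn_success(["attempt", "reframe", "escalate"], toy_judge))
```

Against this toy judge the single-turn attempt fails and the three-turn sequence succeeds, which is the pattern a single-turn benchmark cannot see.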
Cisco’s test ran both approaches side by side. Multi-turn attack success rates across the 15 models ranged from 7.89% to 88.30%, a range 18 pp wider than the single-turn spread of 2.19% to 64.91%. Eight of the 15 models showed a gap of more than 15 pp between the two test types.
No model is clean
Every model in the group showed vulnerability to multi-turn attacks. The lowest multi-turn attack success rate in the study was Amazon Nova 2 Lite at 7.89%. The highest was Grok 4.1 Fast at 88.30%, tested with its reasoning feature turned off. Most of the group fell between 11% and 31%.
Single-turn scores hide risk
The single-turn numbers give a misleading picture for several models. GPT-5.4 has a 2.74% single-turn attack success rate, second-lowest in the group. Under multi-turn testing, it reaches 24.68%, a 9x increase. GPT-5.2 moves from 4.74% to 23.50%. Gemini 3 Pro starts at 18.10% and climbs to 73.35%, a 55 pp jump. Those shifts don’t appear on standard safety tests.
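The arithmetic behind those comparisons can be checked with a short Python sketch. The figures are the ones reported above; the helper names `gap_pp` and `multiplier` are illustrative, not taken from the report. It shows why the percentage-point gap and the relative multiplier are worth reading together: GPT-5.4 has the largest multiplier of the three, while Gemini 3 Pro has the largest absolute gap.

```python
# (single-turn %, multi-turn %) attack success rates as reported in the article
reported = {
    "GPT-5.4": (2.74, 24.68),
    "GPT-5.2": (4.74, 23.50),
    "Gemini 3 Pro": (18.10, 73.35),
}


def gap_pp(single: float, multi: float) -> float:
    """Absolute change in percentage points."""
    return round(multi - single, 2)


def multiplier(single: float, multi: float) -> float:
    """Relative change: how many times higher the multi-turn rate is."""
    return round(multi / single, 1)


for model, (single, multi) in reported.items():
    print(f"{model}: {gap_pp(single, multi)} pp gap, {multiplier(single, multi)}x")
```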
The Amazon models moved in the opposite direction. Nova Lite’s attack success rate dropped by almost 35 pp between single-turn and multi-turn testing; Nova Micro’s dropped by 34 pp.
Grok 4.1 Fast reached an 88.30% multi-turn attack success rate when its reasoning feature was turned off. Reasoning mode is a setting that causes an AI model to work through a request step by step before responding, rather than answering immediately. With that setting turned on, the same model dropped to 43.47%, a reduction of nearly 45 pp from one configuration change. Cisco argues that AI providers should tell customers which settings meaningfully affect security, not just performance.

(Performance by model – Source: Cisco Proprietary Problems: How Frontier Closed Models Collapse Under Iterative Pressure)
What to do about it
Cisco recommends three steps for frontier AI developers to consider as part of the development and deployment process.
- Publish attack success rates, broken down by attack type, for every release, not just a single overall number.
- Hold a model back from deployment if it gets worse on the three highest-risk attack approaches (Imposter AI, Soft Paraphrase, and System Prompts) or the three highest-risk content categories (Hate Speech, Profanity, and Specialized Advice).
- Flag any model showing a gap greater than 15 pp between single-turn and multi-turn attack success rates for manual review before deployment, a threshold that would have flagged eight of the 15 models in this study.
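The second and third recommendations could be expressed as a simple release check. The Python below is a minimal sketch under stated assumptions, not Cisco's tooling: the function names are hypothetical, the gap is measured in either direction (the article does not say whether a drop counts), and the per-category numbers in the usage lines are made up for illustration.

```python
from typing import Dict, List, Sequence

GAP_THRESHOLD_PP = 15.0  # threshold named in the recommendation

# Names taken from the article; they serve only as dictionary keys here.
WATCHED: Sequence[str] = (
    "Imposter AI", "Soft Paraphrase", "System Prompts",  # attack approaches
    "Hate Speech", "Profanity", "Specialized Advice",    # content categories
)


def needs_manual_review(single_pct: float, multi_pct: float,
                        threshold_pp: float = GAP_THRESHOLD_PP) -> bool:
    """Flag a model whose single-turn and multi-turn rates differ by more
    than the threshold. Assumption: the gap counts in either direction."""
    return abs(multi_pct - single_pct) > threshold_pp


def regressions(previous: Dict[str, float], candidate: Dict[str, float],
                watched: Sequence[str] = WATCHED) -> List[str]:
    """Return watched breakdowns where the candidate's attack success rate rose."""
    return [name for name in watched
            if name in previous and name in candidate
            and candidate[name] > previous[name]]


# GPT-5.4 figures as reported in the article: 2.74% single-turn, 24.68% multi-turn.
print(needs_manual_review(2.74, 24.68))

# Hypothetical per-category rates for two releases (made-up numbers).
print(regressions({"Profanity": 10.0, "Hate Speech": 4.0},
                  {"Profanity": 12.5, "Hate Speech": 3.0}))
```

A release would be held back if `regressions` returns anything, and sent to manual review if `needs_manual_review` is true.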
Governance and regulation considerations
NIST’s AI Risk Management Framework and its forthcoming Cyber AI Profile both call for attack-based testing of AI models. The EU AI Act’s Article 15 requires robustness testing for high-risk AI systems starting December 2027. Neither the NIST guidance nor the EU AI Act currently specifies multi-turn testing or the attack-type breakdown Cisco advocates.

