Hacking using AI-written exploits: human tests of LLM agents’ PoCs show they are almost all fake. 🤖🖋️📜👨🏻‍💻🤡

Back in 2025, researchers Vikram Nitin, Baishakhi Ray, and Roshanak Zilouchian Moghaddam reported that their LLM-based framework was able to generate PoC code for vulnerabilities based on CVE descriptions with a success rate of 36%. Impressive results!

In 2026, a group of academic researchers tested the same approach but added manual validation of PoC exploits (not only LLM-based tools). Unfortunately, the manual validation showed that 71.5% of “successful” PoCs were actually invalid.

The LLM-driven exploit generation pipeline looks like this:

  1. Take a CVE description.

  2. Use an LLM agent to reason about possible data flows and branch conditions.

  3. Generate a proof-of-concept (PoC) exploit.

  4. Run it.

  5. Refine it in a feedback loop.

Now, after manual review, researchers found that the LLM was simulating the success of the PoC to complete the task.

For example:

🧨 The LLM was not actually triggering the exploit to confirm successful exploitation. Instead, the LLM-generated exploit was printing “[VULN]” or “[Success]” without even touching the target.

🧨 Instead of analyzing the target code logic, the LLM hardcoded a simplified version of the vulnerable logic directly into the PoC exploit to “simulate” execution after exploitation. And then - it reported “[Success].”

🧨 In some cases, the LLM used legitimate methods to create evidence of exploitation - instead of running the exploit to create a file, it simply created the file directly in the target directory.

Very interesting results, especially keeping in mind recent announcements from Anthropic about entering the cybersecurity business. 🙂 I’ve added both papers below - have a look and please share!

More details:

I Can’t Believe It’s Not a Valid Exploit [PDF, 2026]: https://arxiv.org/abs/2602.04165

FaultLine: Automated Proof-of-Vulnerability Generation Using LLM Agents [PDF, 2025]: https://arxiv.org/abs/2507.15241

Subscribe now

原始链接: https://it4sec.substack.com/p/hacking-using-ai-written-exploits
侵权请联系站方: [email protected]

相关推荐

换一批