Spy vs. spy: How GenAI is powering defenders and attackers

Generative AI (GenAI) is reshaping cybersecurity for both attackers and defenders, but its future capabilities are difficult to measure as techniques and models are evolving rapidly.
Adversaries continue to use GenAI with varying levels of reliance. State-sponsored groups continue to take advantage, while criminal organizations are beginning to benefit from the prevalence of uncensored and open-weight models.
Today, threat actors are using GenAI for coding, phishing, anti-analysis/evasion, and vulnerability discovery. It’s also starting to show up in malware samples, although significant human involvement is still a requirement.
As models continue to shrink and hardware requirements lessen, adversarial access to GenAI and its capabilities is poised to surge.
Defenders can use GenAI as a force multiplier to parse through vast threat data, enhance incident response, and proactively detect code vulnerabilities, helping to overcome analyst shortages.

Generative AI (GenAI) has caused a fundamental shift in how people work and its impact is being felt almost everywhere. Individuals and enterprises alike are rushing to see how GenAI can make their lives easier or their work faster and more efficient. In information security, the focus has largely been on how adversaries are going to leverage it, and less on how defenders can benefit from it. While we are undoubtedly seeing GenAI have an impact on the threat landscape, quantifying that impact is difficult at best. The overwhelming majority of benefits from GenAI are impossible to determine from the finished malware we see, especially as vibe coding becomes more common.

AI and GenAI are evolving at an exponential pace, and as a result the landscape is changing rapidly. This blog is a snapshot of current AI usage. As models continue to shrink and hardware requirements lessen, it’s likely we are only seeing the tip of the iceberg on GenAI’s potential.

Adversarial GenAI usage

Cisco Talos has covered this topic previously but the landscape continues to evolve at an exponential pace. Anthropic recently reported that state-sponsored groups are starting to leverage the technology in campaigns, while still requiring significant human help. The industry has also started to see actors embedding prompts into malware to evade detection. However, most of these methods are experimental and unreliable. They can greatly increase execution times, due to the nature of AI responses, and can result in execution failures. The technology is still in its infancy but current trends show significant AI usage is likely coming.

Adversaries are also leveraging prompts in malware and DNS records, mainly for anti-analysis purposes. For example, if defenders use a GenAI tool while analyzing malware, the model may come across the adversary’s embedded prompt, obey its instruction to ignore all previous instructions, and return benign results. This new evasion method is likely to grow as AI systems play a bigger role in detection and analysis.
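One way defenders could catch this kind of embedded prompt before a sample reaches an AI-assisted pipeline is to scan its printable strings for instruction-like phrases. The following is a minimal sketch, not a production detector: the phrase list and the function names are assumptions made for illustration, and real samples will use wording this list does not cover.

```python
import re

# Hypothetical phrase list for illustration only; real-world prompts vary widely.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard (all )?(previous|prior) instructions",
    r"you are (now )?an? (ai|assistant|language model)",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in INJECTION_PATTERNS]


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return printable ASCII runs, similar in spirit to the `strings` utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def find_embedded_prompts(data: bytes) -> list[str]:
    """Return the extracted strings that look like instructions aimed at a model."""
    return [
        s for s in extract_strings(data)
        if any(p.search(s) for p in _COMPILED)
    ]
```

A hit does not prove a sample is malicious, but it is a cheap signal that the content should be handled as untrusted data and not passed to a model verbatim.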

However, Talos continues to see the largest impacts on the conversational side of compromise, such as email content and social engineering. We have also seen plenty of examples of AI being used as a lure to trick users into installing malware. There is no doubt that, in the early days of GenAI, only well-funded threat groups were leveraging AI at high levels, most prominently at the state-sponsored level. With the evolution of the models and, more importantly, the abundance of uncensored and open-weight models, the barrier to entry has lowered and other groups are likely using it.

Adversarial usage of AI is still difficult to quantify since most of the impacts are not visible in the end product. The most common applications of GenAI are helping with errors in coding, vibe coding functions, generating phishing emails, or gathering information on a future target. Regardless, the results rarely appear AI generated. Only companies operating publicly available models have the insights required to see how adversaries are using the technology, but even that view is limited.

Although this is how the GenAI landscape appears today, there are indications it is starting to shift. Uncensored models are becoming common and are easily accessible, and overall, the models continue to shrink in both size and associated hardware requirements. In the next year or two, it seems likely adversaries will gain the advantage. Defensive improvements will follow, but it is unclear at this point if they will keep pace.

Vulnerability hunting

The use of GenAI to find vulnerabilities in code and software is an obvious application, but one that both offensive and defensive actors can use. Threat groups may leverage GenAI to uncover zero-day vulnerabilities to use maliciously, but what about the researchers using GenAI to help them triage fuzz farm outputs? If the researcher is focused on coordinated disclosure resulting in patches and not on selling to the highest bidder, GenAI is largely benign. Unfortunately, players on both sides are flooding the zone with GenAI-powered vulnerability discovery. For now we’ll focus purely on vulnerability analysis from outside the organization. The ways internal developers should use GenAI will be addressed in the next section.

For closed-source software, fuzzing is key for vulnerability discovery. For open-source software, however, GenAI can perform deep reviews of public code and find vulnerabilities, whether they are reported in coordination with vendors or sold on the black market. As lightweight and specialized models continue to appear over the next few years, this aspect of vulnerability hunting is likely to surge.

Regardless of the end goal, vulnerability hunting is an effective and attractive GenAI application. Most modern applications have hundreds of thousands, if not millions, of lines of code, and analyzing them can be a daunting task. This task is complicated by the barrage of enhancements and updates made to products during their lifetime. Every code change introduces risk and GenAI might currently be the best option to mitigate it.

Enterprise security applications of GenAI

On the positive side of technology, there is incredible research and innovation underway. One of the biggest challenges in information security is an astronomical volume of data, without enough analysts available to process it. This is where GenAI shines.

The amount of threat intelligence being generated is huge. Historically, there were a handful of vendors producing high-value threat intelligence reporting. That number is likely in the hundreds now. The result is massive amounts of data covering a staggering amount of activity. This is an ideal application for GenAI: Let it parse through the data, pull out what’s important, and help block indicators across your defensive portfolio.
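Whatever the model pulls out of a report still has to be normalized before it can be pushed to blocklists. As a rough illustration of that extraction step, here is a small sketch that collects indicators from report text using regular expressions alone. The function names and the output layout are assumptions, and the domain pattern is deliberately naive (it will also match file names such as kernel32.dll), so treat the output as candidates to validate, not as ready-to-block indicators.

```python
import ipaddress
import re

SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Naive domain pattern: labels separated by dots, ending in an alphabetic TLD.
DOMAIN_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def refang(text: str) -> str:
    """Undo two common defanging conventions used in threat reports."""
    return text.replace("[.]", ".").replace("hxxp", "http")


def extract_iocs(report: str) -> dict[str, set[str]]:
    """Collect candidate SHA-256 hashes, IPv4 addresses, and domains."""
    text = refang(report)
    ips = set()
    for candidate in IPV4_RE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)  # rejects octets above 255
            ips.add(candidate)
        except ValueError:
            pass
    return {
        "sha256": {h.lower() for h in SHA256_RE.findall(text)},
        "ipv4": ips,
        "domains": {d.lower() for d in DOMAIN_RE.findall(text)},
    }
```

A deterministic pass like this can also serve as a cross-check on model output: an indicator the model reports that does not appear anywhere in the source text deserves a second look.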

Additionally, when you are in the middle of an incident and have reams of logs to correlate the attack and its impact, GenAI could be a huge advantage. Instead of spending hours poring over the logs, GenAI should be able to quickly and easily identify things like attempted lateral movement, exploitation, and initial access. It might not be a perfect source but will likely point responders to logs that should be further investigated. This allows responders to quickly focus on key points in the timeline and hopefully help mitigate the ongoing damage.

From a proactive perspective, there are a couple of areas where GenAI will benefit defenders. One of the first places an organization should look to implement GenAI is on analyzing committed code. No developer is perfect and humans make mistakes. Sometimes these mistakes can lead to huge incidents and millions or billions of dollars in damages.

Every time code is committed there is a risk that a vulnerability has been introduced. Leveraging GenAI to analyze each commit before it is applied can mitigate some of this risk. Since the LLM will have access to source code, it can more easily spot common mistakes that often result in vulnerabilities. While it may not detect complex attack chains that combine low- to medium-severity bugs to achieve remote code execution (RCE), it can still find the obvious mistakes that sometimes evade code reviews.
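A commit review step along these lines might look like the sketch below. It assumes the diff arrives in git's default unified format and that the model is reached through a caller-supplied function; `ask_model`, the prompt wording, and the function names are all hypothetical, and no particular vendor API is implied.

```python
from typing import Callable, Dict, List

# Hypothetical review instructions; tune these to your own codebase and policies.
REVIEW_INSTRUCTIONS = (
    "You are reviewing newly added code for security issues such as injection, "
    "unsafe deserialization, hard-coded secrets, and missing input validation. "
    "List each finding with the file name and the offending line."
)


def added_lines(unified_diff: str) -> Dict[str, List[str]]:
    """Map each changed file to the lines the diff adds to it."""
    changes: Dict[str, List[str]] = {}
    current = None
    for line in unified_diff.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":  # file was deleted, nothing added
                current = None
                continue
            current = path[2:] if path.startswith("b/") else path
            changes.setdefault(current, [])
        elif line.startswith("+") and current is not None:
            changes[current].append(line[1:])
    return changes


def build_review_prompt(changes: Dict[str, List[str]]) -> str:
    """Assemble the instructions and the added code into one prompt."""
    sections = [REVIEW_INSTRUCTIONS]
    for path, lines in changes.items():
        if lines:
            sections.append(f"File: {path}\n" + "\n".join(lines))
    return "\n\n".join(sections)


def review_commit(unified_diff: str, ask_model: Callable[[str], str]) -> str:
    """Send added lines to the model; return an empty string if nothing was added."""
    changes = added_lines(unified_diff)
    if not any(changes.values()):
        return ""
    return ask_model(build_review_prompt(changes))
```

In practice this would run as a pre-merge check, with the model's findings posted for a human reviewer to confirm. Committed code is itself untrusted input, so the model's response should inform a review and not gate a merge on its own.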

Red teamers can also use GenAI to streamline activities. By using AI to hunt for and exploit vulnerabilities or weaknesses in security posture, they can operate more efficiently. GenAI can provide starting points to jump-start their research, allowing for faster prototyping and a quicker verdict on whether an approach succeeds or fails.

GenAI and existing tooling

Talos has already covered how Model Context Protocol (MCP) servers can be leveraged to help in reverse engineering and malware analysis, but this only scratches the surface. MCP servers connect a wide array of applications and datasets to GenAI, providing structured assistance for a variety of tasks. There are countless applications for MCP servers, and we are starting to see more flexible plugins that allow a variety of applications and datasets to be accessed via a single plugin. When combined with agentic AI, this could allow for huge leaps in productivity. MCP servers were also part of the technology stack used by state-sponsored adversaries in the abuse covered by Anthropic.

Agentic AI’s impact

The meteoric rise of agentic AI will undoubtedly have an impact on the threat landscape. With agentic AI, adversaries could deploy agents constantly working to compromise new victims, setting up a pipeline for ransomware cartels. They could build agents focused on finding vulnerabilities in new commits to open-source projects or fuzzing various applications while triaging the findings. State-sponsored groups could task agents, which never need a break to eat or sleep, with breaking into high-value targets, allowing them to hack until they find a way in and to constantly monitor for changes in attack surface or the introduction of new systems.

On the other hand, defenders can use agentic AI as a force multiplier. Now you have some extra analysts looking for the low-and-slow attacks that might slip under your radar. Maybe an agent is tasked with watching Windows logs for indications of compromise, lateral movement, and data exfiltration. Yet another agent can monitor the security of your endpoints and flag systems that are at higher risk of compromise due to improper access controls, incomplete patching, or other security concerns. Agents can even protect users from phishing or spam emails, or from accidentally clicking on malicious links.
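To make the log-watching idea concrete, the sketch below shows the kind of deterministic triage an agent could run over Windows Security logon events before escalating anything to an analyst. Event IDs 4624 (successful logon) and 4625 (failed logon) and logon types 3 (network) and 10 (remote interactive) are standard Windows values; the event dictionary layout, the thresholds, and the function name are assumptions for illustration.

```python
from collections import defaultdict

LOGON_SUCCESS = 4624  # an account was successfully logged on
LOGON_FAILURE = 4625  # an account failed to log on
REMOTE_LOGON_TYPES = {3, 10}  # 3 = network, 10 = remote interactive (RDP)


def triage_logons(events, host_threshold=3, failure_threshold=5):
    """Flag accounts fanning out across hosts and sources with repeated failures.

    `events` is an iterable of dicts using a hypothetical, pre-parsed schema:
    event_id, logon_type, account, host, source_ip.
    """
    hosts_by_account = defaultdict(set)
    failures_by_source = defaultdict(int)

    for event in events:
        event_id = event["event_id"]
        if event_id == LOGON_SUCCESS and event.get("logon_type") in REMOTE_LOGON_TYPES:
            hosts_by_account[event["account"]].add(event["host"])
        elif event_id == LOGON_FAILURE:
            failures_by_source[event.get("source_ip", "unknown")] += 1

    findings = []
    for account, hosts in sorted(hosts_by_account.items()):
        if len(hosts) >= host_threshold:
            findings.append(
                f"possible lateral movement: {account} logged on remotely "
                f"to {len(hosts)} hosts"
            )
    for source, count in sorted(failures_by_source.items()):
        if count >= failure_threshold:
            findings.append(
                f"repeated logon failures: {count} failed attempts from {source}"
            )
    return findings
```

Thresholds like these will produce false positives for service accounts and administrators, which is where an agent with context about the environment, or a human analyst, has to make the final call.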

In the end, it all comes down to people

There is one key resource that underpins all of these capabilities: humans. Ultimately, GenAI can complete tasks efficiently and effectively, but only for those who understand the underlying technology. Developers who understand code can use GenAI to increase throughput without sacrificing quality. In contrast, non-experts may struggle to use GenAI tools effectively, producing code they can’t understand or maintain.

Even Anthropic’s recent reporting notes that AI agents still require human assistance to carry out attacks. The lesson is clear: People without deep knowledge can still accomplish a lot with GenAI, but its true greatness will only be available to those who understand what is right and possible with this new and emerging technology.
