Of all the potential nightmares about the dangerous effects of generative AI (genAI) tools like OpenAI’s ChatGPT and Microsoft’s Copilot, one is near the top of the list: their use by hackers to craft hard-to-detect malicious code. Even worse is the fear that genAI could help rogue states like Russia, Iran, and North Korea unleash unstoppable cyberattacks against the US and its allies.
The bad news: nation states have already begun using genAI to attack the US and its friends. The good news: so far, the attacks haven’t been particularly dangerous or especially effective. Even better news: Microsoft and OpenAI are taking the threat seriously. They’re being transparent about it, openly describing the attacks and sharing what can be done about them.
That said, AI-aided hacking is still in its infancy. And even if genAI is never able to write sophisticated malware, it can be used to make existing hacking techniques far more effective — especially social engineering ones like spear phishing and the theft of passwords and identities to break into even the most hardened systems.
The genAI attacks to date
Microsoft and OpenAI recently revealed a spate of genAI-assisted attacks and detailed how the companies have been fighting them. (The attackers used OpenAI’s services; OpenAI’s models are also the basis for Microsoft’s Copilot, and Microsoft has invested $13 billion in OpenAI.)
OpenAI explained in a blog post that the company has disrupted hacking attempts from five “state-affiliated malicious actors” — Charcoal Typhoon and Salmon Typhoon, connected to China; Crimson Sandstorm, connected to Iran; Emerald Sleet, connected to North Korea; and Forest Blizzard, connected to Russia.
Overall, OpenAI said, the groups used “OpenAI services for querying open-source information, translating, finding coding errors, and running basic coding tasks.”
It’s all fairly garden-variety hacking, according to the company. For example, Charcoal Typhoon used OpenAI services to “research various companies and cybersecurity tools, debug code and generate scripts, and create content likely for use in phishing campaigns.” Forest Blizzard used them “for open-source research into satellite communication protocols and radar imaging technology, as well as for support with scripting tasks.” And Crimson Sandstorm used them for “scripting support related to app and web development, generating content likely for spear-phishing campaigns, and researching common ways malware could evade detection.”
In other words, we’ve not yet seen supercharged coding, new techniques for evading detection, or serious advances of any kind, really. So far, OpenAI’s tools have mainly been used to support existing malware and hacking campaigns.
“The activities of these actors are consistent with previous red team assessments we conducted in partnership with external cybersecurity experts, which found that GPT-4 offers only limited, incremental capabilities for malicious cybersecurity tasks beyond what is already achievable with publicly available, non-AI powered tools,” OpenAI concluded.
In a separate blog post, Microsoft echoed OpenAI, offered more details, and laid out the framework the company is using to fight the hacking. Its assessment: “Microsoft and OpenAI have not yet observed particularly novel or unique AI-enabled attack or abuse techniques resulting from threat actors’ usage of AI.”
And now, the bad news…
That’s all good to hear, as is the decision by Microsoft and OpenAI to be so transparent about genAI hacking dangers and their efforts to combat them. But remember, genAI is still in its infancy. Don’t be surprised if eventually this technology becomes capable of building far more effective malware and hacking tools.
Even if that never happens, there’s plenty to worry about, because genAI can make existing techniques far more powerful. A dirty little secret of hacking is that many of the most successful and dangerous attacks have nothing to do with the quality of the code hackers use. Instead, hackers turn to “social engineering”: convincing people to hand over passwords or other identifying information that can be used to break into systems and wreak havoc.
That’s how the group Fancy Bear, associated with the Russian government, hacked Hillary Clinton’s campaign during the 2016 presidential election, stole her emails, and eventually made them public. The group sent an email to the personal Gmail account of campaign chairman John Podesta, convinced him it was sent by Google, and told him he needed to change his password. He clicked a malicious link, and the hackers stole his password and then used those credentials to break into the campaign network.
Perhaps the most effective social engineering technique is “spear phishing”: targeting specific people with emails or phone calls that contain information only they would likely know. That’s where genAI shines. State-sponsored hacker groups often don’t have a good grasp of English, and their spear-phishing emails can sound inauthentic. But they can now use ChatGPT or Copilot to write far more convincing emails.
In fact, they’re already doing it. And they’re doing even worse.
As security company SlashNext explains, there’s already a toolkit circulating called WormGPT, a genAI tool “designed specifically for malicious activities.”
The company got its hands on the tool and tested it, asking WormGPT to craft an email “intended to pressure an unsuspecting account manager into paying a fraudulent invoice.”
According to SlashNext, “the results were unsettling. WormGPT produced an email that was not only remarkably persuasive but also strategically cunning, showcasing its potential for sophisticated phishing and BEC [business email compromise] attacks. In summary, it’s similar to ChatGPT, but has no ethical boundaries or limitations. This experiment underscores the significant threat posed by generative AI technologies like WormGPT, even in the hands of novice cybercriminals.”
Even that falls far short of what genAI can do. It can create fake photos and fake videos, which can be used to make spear-phishing attacks more persuasive. It can supercharge internet searches to more easily find personal information about people. It can imitate people’s voices. (Imagine getting a phone call from someone who sounds like your boss or someone in IT. You’re likely to do whatever you’re told to do.)
All this is possible today. In fact, according to SlashNext, the launch of ChatGPT has led to a 1,265% increase in phishing emails, “signaling a new era of cybercrime fueled by generative AI,” in the company’s words.
And that means, despite the considerable work OpenAI and Microsoft are doing to fight genAI-powered hacking, timeworn attacks — spear phishing and other social engineering techniques — may be the biggest genAI hacking danger we face for some time to come.