In the rush to implement national security use cases for artificial intelligence and machine learning, policymakers must ensure they weigh the risks properly, say experts in the field.

Like all software, artificial intelligence (AI)/machine learning (ML) is vulnerable to hacking. But because of the way it must be trained, AI/ML is even more exposed than most software: it can be successfully attacked even by adversaries with no access to the computer network it is running on.

“There hasn’t been enough attention from policymakers on the risks of AI hacking,” says Andrew Lohn, senior fellow at the Center for Security and Emerging Technology – or CSET – a nonpartisan think tank housed at the Walsh School of Foreign Service at Georgetown University. “There are people pushing for AI adoption without fully understanding the risks they’re going to have to accept along the way.”

In a series of articles and presentations, Lohn and his CSET colleagues sought to draw attention to the growing body of academic research showing that AI/ML algorithms can easily be attacked in a variety of ways. “The goal is not to stop [the deployment of AI] but to make sure people understand all of what they’re asking for,” he says.

Lohn thinks the threat of AI hacking isn’t just academic or theoretical. White hat hackers have successfully demonstrated real-world attacks on AI-powered self-driving systems such as those used by Tesla cars. Researchers at Chinese technology giant Tencent managed to get the car’s Autopilot feature to change lanes into oncoming traffic using inconspicuous stickers on the road surface.

At McAfee Security, researchers used equally inconspicuous stickers on speed limit signs to get a car to accelerate to 85 mph in a 35 mph zone.

Based on his interactions with engineers working on commercial AI technology, Lohn believes these or other types of attacks have already been carried out in the wild by real cyber threat actors. “People are very reluctant to talk about when they were attacked,” he says. “It’s more of a nod and a wink.”

And, of course, successful attacks may go undetected. Neal Ziring, technical director of the Cybersecurity Directorate of the National Security Agency (NSA), told a Billington Cybersecurity Webinar in January that although there is a burgeoning academic literature on how to perform attacks, research into their detection was “much less mature at this point”.

“The most challenging aspect of securing AI/ML,” Ziring said, is that it has a deployment pipeline that attackers can strike at any point.

AI/ML systems need to be trained before deployment using large datasets – images of faces, for example, are used to train facial recognition software. By examining millions of tagged images, AI/ML can be trained to distinguish, for example, cats from dogs.
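
The mechanics of that pipeline can be sketched in a few lines. The example below is a minimal illustration, assuming PyTorch (the article names no particular framework) and random stand-in tensors in place of a real labelled image set: the model repeatedly examines tagged examples and nudges its weights until it can separate the two classes.

```python
# Minimal sketch of supervised training, assuming PyTorch; the random
# tensors stand in for a real labelled image dataset (e.g. cats vs. dogs).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# 200 fake 3x32x32 "images": the first half tagged 0 (cat), the rest 1 (dog)
images = torch.randn(200, 3, 32, 32)
labels = torch.cat([torch.zeros(100, dtype=torch.long),
                    torch.ones(100, dtype=torch.long)])
loader = DataLoader(TensorDataset(images, labels), batch_size=32, shuffle=True)

# A tiny convolutional classifier with two output classes
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):               # examine the tagged examples repeatedly
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)  # how far off are the current predictions?
        loss.backward()              # compute how to adjust each weight
        optimizer.step()             # nudge the model toward the correct tags
```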

But it’s this training pipeline that makes AI/ML vulnerable even to attackers who have no access to the network it’s running on.

Data poisoning attacks work by seeding specially crafted images into AI/ML training sets, which in some cases are scraped from the public internet or harvested from social media and other platforms.

“If you don’t know where these pieces of data come from,” Ziring warned, “if you don’t carefully track their provenance and integrity, you could allow inappropriate or malicious examples to sneak in.”

While indistinguishable to the human eye from authentic images, poisoned images contain data that can cause AI/ML to misidentify entire categories of items. “The mathematics of these systems, depending on the type of model you use, can be very sensitive to changes in how recognition or classification is performed, even based on a small number of training items,” explained Ziring.

Indeed, according to a presentation last year by Stanford professor of cryptography Dan Boneh, a single corrupted image in a training set can be enough to poison an algorithm and cause it to mistakenly identify thousands of pictures.
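
As a purely illustrative Python sketch (not a reproduction of Boneh’s demonstration), the simplest form of poisoning stamps a handful of training images with a tiny pixel pattern and attaches the attacker’s chosen label, so that a model trained on the set learns to associate the pattern with that label:

```python
# Illustrative data-poisoning sketch, assuming PyTorch tensors of shape
# (N, channels, height, width) for images and (N,) for integer labels.
import torch

def poison(images, labels, n_poison, target_label):
    """Return copies of (images, labels) with n_poison backdoored examples."""
    images, labels = images.clone(), labels.clone()
    idx = torch.randperm(len(images))[:n_poison]   # pick a few items to corrupt
    images[idx, :, :3, :3] = 1.0                   # plant a tiny 3x3 corner trigger
    labels[idx] = target_label                     # attach the attacker's chosen tag
    return images, labels

# A model trained on the poisoned set tends to learn "trigger => target_label",
# so any input stamped with the same patch is later misclassified.
```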

Poisoned images can be made in different ways, Boneh explained, demonstrating a technique known as the fast gradient sign method, or FGSM, which identifies key data points in training images. Using FGSM, an attacker can make pixel-level changes, called “perturbations”, to an image. Although invisible to the human eye, these perturbations turn the image into an “adversarial example” – an input crafted to fool the model into misidentifying it.

“The model is trained to do one thing, but we can confuse it by submitting a carefully crafted adversarial example,” Boneh said.
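
A minimal sketch of FGSM, again assuming PyTorch and a single batched input image, makes the mechanism concrete: the perturbation is just the sign of the loss gradient with respect to the input pixels, scaled by a small epsilon.

```python
# FGSM sketch, assuming PyTorch; `image` is a (1, C, H, W) tensor with values
# in [0, 1] and `true_label` is a (1,) tensor holding the correct class index.
import torch
import torch.nn.functional as F

def fgsm_example(model, image, true_label, epsilon=0.01):
    """Return an adversarial copy of `image` the model is likely to mislabel."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()                                   # gradient of the loss w.r.t. each pixel
    perturbation = epsilon * image.grad.sign()        # +/- epsilon per pixel: the "perturbation"
    adversarial = (image + perturbation).clamp(0, 1)  # keep pixel values valid
    return adversarial.detach()
```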

Attacks using FGSM are typically “white box” attacks, in which the attacker has access to the model’s internals, such as its source code. White box attacks can be carried out on open source AI/ML, of which there is a rapidly growing collection on GitHub and other open source repositories.

But academics have also demonstrated many “black box” data poisoning attacks, in which the attacker only has access to inputs, training data, and outputs – the ability to see how the system classifies incoming images.

Indeed, the Defense Advanced Research Projects Agency (DARPA) claims that “rapidly proliferating” academic research in the field is “characterized by increasingly complex attacks that require less and less knowledge about the ML system under attack, while proving increasingly strong against defensive countermeasures”. The agency’s Guaranteeing AI Robustness against Deception (GARD) program is charged with “developing the theoretical foundations of defensible ML” and “creating and testing defensible systems”.

GARD is building a testbed in which the robustness and defenses of AI/ML systems can be measured under real-world threat scenarios. The goal? “Create deception-resistant ML technologies with strict criteria to evaluate their robustness.”

The proliferation of open-source AI/ML tools, including training datasets of uncertain provenance, opens the door to software supply chain attacks as well as data poisoning, points out CSET’s Lohn.

Malicious contributors have successfully introduced malicious code into open source projects, so this is not a purely hypothetical threat. “AI is a battleground for great powers. Machine learning applications are increasingly valuable targets for highly sophisticated cyber adversaries,” he says.

One of the types of attacks that national security agencies are most concerned about is the so-called extraction attack. Extraction is a black box attack technique used to reverse engineer AI/ML models or to gain insight into the data used to train them.

NSA’s Ziring explained extraction attacks this way: “If you’re a government agency, you put a lot of effort into training your model, maybe you used very sensitive data to train it … an attacker could attempt to interrogate your model in a mathematically guided way in order to extract facts about the model, its behavior, or the data that was used to train it. If the data used to train it was very sensitive, proprietary, non-public, you don’t want that to happen.”
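
Conceptually, an extraction attack needs nothing more than query access. The sketch below is a simplified illustration, not any agency’s actual system: query_victim stands for whatever black-box interface the attacker can reach, and the local surrogate architecture is an arbitrary choice.

```python
# Model-extraction sketch, assuming PyTorch and a remote `query_victim`
# callable that returns predicted class indices for a batch of inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

def extract_surrogate(query_victim, input_shape, n_classes, n_queries=10_000):
    """Train a local stand-in for a remote model purely from its answers."""
    n_features = 1
    for d in input_shape:
        n_features *= d
    surrogate = nn.Sequential(nn.Flatten(), nn.Linear(n_features, 64),
                              nn.ReLU(), nn.Linear(64, n_classes))
    optimizer = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    for _ in range(n_queries // 64):
        queries = torch.rand(64, *input_shape)   # attacker-chosen probes
        answers = query_victim(queries)          # labels returned by the victim
        loss = F.cross_entropy(surrogate(queries), answers)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return surrogate   # approximates the victim's behavior without seeing its weights
```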

A public-private partnership model should be used to combat AI/ML threats, Ziring said. “Multi-stakeholder organizations are exactly the right place to pursue this. Why? Because you’re going to get that diversity of perspective and different insights from different stakeholders that can help inform a security consensus … needed to secure AI/ML systems.”

“Attacks will happen in the real world,” warns Professor Bo Li of the University of Illinois at Urbana-Champaign, saying it’s important to get ahead of the threat. “I would say it’s important to study this now rather than later because otherwise it’s even more expensive to develop these defense algorithms once we’ve already deployed the models.”

Incorporating “design principles like explainability and human-style knowledge and reasoning ability like domain knowledge” could help, Li says. Building these in from the start gives models’ operators and defenders tools they can use to construct defenses, she explains.

“We need to understand why the model is giving us this prediction, what quantitative or qualitative metrics I can trace back to know what information triggered this prediction, and if I know this prediction is wrong or has a problem, then I can understand what might have caused the problem.”