AI 'vaccinated' against attacks
Some of Australia’s top tech specialists have developed a set of techniques to effectively ‘vaccinate’ machine learning algorithms against adversarial attacks.
Algorithms ‘learn’ from the data they are trained on to perform a given task effectively without needing specific instructions. These techniques are already widely used, for example to identify spam emails, diagnose diseases from X-rays and predict crop yields, and they will soon drive our cars.
While machine learning holds enormous potential to positively transform our world, the tools are vulnerable to adversarial attacks: machine learning models can be fooled by maliciously crafted input data, causing them to malfunction.
Dr Richard Nock, machine learning group leader at CSIRO’s Data61, says that by adding a layer of noise over an image to create an ‘adversarial’ example, attackers can deceive machine learning models into misclassifying it.
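The kind of attack Dr Nock describes can be sketched in a few lines of code. The example below is illustrative only and is not Data61’s technique: it uses the widely known fast gradient sign method (FGSM), written here in PyTorch, to add a small noise layer derived from a model’s own gradients so that the model misclassifies the image. The function name, the epsilon value, and the assumption that a trained classifier and labelled images are already available are all placeholders.

import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=0.03):
    """Add a small gradient-sign 'noise layer' to a batch of images (FGSM-style attack).

    model:   any differentiable PyTorch classifier returning raw logits
    images:  tensor of shape (N, C, H, W) with values assumed in [0, 1]
    labels:  tensor of shape (N,) holding the true class indices
    epsilon: maximum per-pixel distortion (illustrative value)
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Nudge every pixel slightly in the direction that increases the loss.
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()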
“Adversarial attacks have proven capable of tricking a machine learning model into incorrectly labelling a traffic stop sign as a speed sign, which could have disastrous effects in the real world,” he said.
“Our new techniques prevent adversarial attacks using a process similar to vaccination.
“We implement a weak version of an adversary, such as small modifications or distortion to a collection of images, to create a more ‘difficult’ training data set. When the algorithm is trained on data exposed to a small dose of distortion, the resulting model is more robust and immune to adversarial attacks.”
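A rough sketch of the ‘vaccination’ loop described in the quote above, assuming the fgsm_perturb helper and imports from the previous example: each training batch is distorted by a weak adversary before the model learns from it. This is generic adversarial training rather than the specific construction in the Data61 paper, and the loader, optimiser and hyperparameter values are assumed placeholders.

def vaccinate(model, loader, optimizer, epsilon=0.01, epochs=5):
    """'Vaccinated' training: fit the model on weakly distorted images."""
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            # Build the 'more difficult' training batch with a weak adversary.
            hardened = fgsm_perturb(model, images, labels, epsilon)
            optimizer.zero_grad()  # discard gradients left over from the attack step
            loss = F.cross_entropy(model(hardened), labels)
            loss.backward()
            optimizer.step()

In practice, adversarial training of this kind typically trades some accuracy on clean data for robustness to the distortions it was trained against.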
In a new paper, the researchers have demonstrated ‘vaccination’ techniques built from the worst possible adversarial examples, which can withstand very strong attacks.
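The paper’s results concern training against the strongest possible adversary; in practice, such worst-case adversarial examples are often approximated with multi-step projected gradient descent (PGD). The sketch below shows that common approximation, reusing the imports above, and should not be read as the method proposed in the paper; all parameter values are illustrative.

def pgd_perturb(model, images, labels, epsilon=0.03, step=0.007, iters=10):
    """Approximate a worst-case adversarial example with projected gradient descent."""
    original = images.clone().detach()
    adversarial = original.clone()
    for _ in range(iters):
        adversarial = adversarial.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(adversarial), labels)
        loss.backward()
        # Ascend the loss, then project back into the epsilon-ball around the original image.
        adversarial = adversarial + step * adversarial.grad.sign()
        adversarial = original + (adversarial - original).clamp(-epsilon, epsilon)
        adversarial = adversarial.clamp(0.0, 1.0)
    return adversarial.detach()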
Adrian Turner, CEO of CSIRO’s Data61, said the research is a significant contribution to the growing field of adversarial machine learning.
“Artificial intelligence and machine learning can help solve some of the world’s greatest social, economic and environmental challenges, but that can’t happen without focused research into these technologies,” he said.
“The new techniques against adversarial attacks developed at Data61 will spark a new line of machine learning research and ensure the positive use of transformative AI technologies.”
The research paper is titled Monge blunts Bayes: Hardness Results for Adversarial Training.