Attacking convolutional neural network using differential evolution

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

The output of convolutional neural networks (CNNs) has been shown to be discontinuous, which can make a CNN image classifier vulnerable to small, well-tuned artificial perturbations. That is, images modified by such alterations (i.e., adversarial perturbations), which are barely noticeable to the human eye, can completely change the CNN classification results. In this paper, we propose a practical attack that uses differential evolution (DE) to generate effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DE for conducting the attack on different network structures. The proposed method modifies only five pixels (i.e., a few-pixel attack), and it is a black-box attack that requires only the oracle feedback of the target CNN system. The results show that under strict constraints that simultaneously control the number of pixels changed and the overall perturbation strength, the attack achieves 72.29%, 72.30%, and 61.28% non-targeted attack success rates, with 88.68%, 83.63%, and 73.07% confidence on average, on three common types of CNNs. The attack requires modifying only five pixels, with pixel-value distortions of 20.44, 14.28, and 22.98, respectively. Thus, we show that current deep neural networks are also vulnerable to such simple black-box attacks, even under very limited attack conditions.
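The abstract describes a black-box, few-pixel attack driven by differential evolution: each candidate solution encodes the coordinates and RGB values of five pixels, and the only feedback used is the probability the target CNN assigns to each class. The sketch below is a minimal illustration of that idea, not the authors' implementation; the predict_proba callable, the H x W x 3 uint8 image layout, and the DE settings (here via SciPy's differential_evolution rather than the paper's DE variants) are assumptions for illustration only.

# Minimal sketch of a DE-based few-pixel, non-targeted black-box attack.
# Assumption: predict_proba(image) is a hypothetical callable returning
# class probabilities for an H x W x 3 uint8 image.
import numpy as np
from scipy.optimize import differential_evolution

N_PIXELS = 5  # the five-pixel budget stated in the abstract

def perturb(image, x):
    # x encodes N_PIXELS tuples of (row, col, R, G, B).
    adv = image.copy()
    for i in range(N_PIXELS):
        row, col, r, g, b = x[5 * i:5 * (i + 1)]
        adv[int(row), int(col)] = np.clip([r, g, b], 0, 255).astype(adv.dtype)
    return adv

def few_pixel_attack(image, true_label, predict_proba, maxiter=75, popsize=40):
    h, w, _ = image.shape
    # DE searches over pixel coordinates and RGB values for each of the 5 pixels.
    bounds = [(0, h - 1), (0, w - 1), (0, 255), (0, 255), (0, 255)] * N_PIXELS

    def fitness(x):
        # Non-targeted objective: drive down the probability of the true class,
        # using only the model's output probabilities (black-box feedback).
        return predict_proba(perturb(image, x))[true_label]

    result = differential_evolution(fitness, bounds, maxiter=maxiter,
                                    popsize=popsize, recombination=1.0,
                                    polish=False, seed=0)
    return perturb(image, result.x)

In use, one would call adv = few_pixel_attack(img, label, model_proba) and check whether np.argmax(model_proba(adv)) no longer equals label; the paper's evaluation additionally constrains the overall pixel-value distortion, which this sketch does not enforce.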

Original language: English
Article number: 1
Journal: IPSJ Transactions on Computer Vision and Applications
Volume: 11
Issue number: 1
DOI: 10.1186/s41074-019-0053-3
Publication status: Published - Feb 1 2019

Fingerprint

Pixels
Neural networks
Classifiers
Feedback

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition

Cite this

@article{db03197abed84e2fac8cba7d975e83ee,
title = "Attacking convolutional neural network using differential evolution",
abstract = "The output of convolutional neural networks (CNNs) has been shown to be discontinuous, which can make a CNN image classifier vulnerable to small, well-tuned artificial perturbations. That is, images modified by such alterations (i.e., adversarial perturbations), which are barely noticeable to the human eye, can completely change the CNN classification results. In this paper, we propose a practical attack that uses differential evolution (DE) to generate effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DE for conducting the attack on different network structures. The proposed method modifies only five pixels (i.e., a few-pixel attack), and it is a black-box attack that requires only the oracle feedback of the target CNN system. The results show that under strict constraints that simultaneously control the number of pixels changed and the overall perturbation strength, the attack achieves 72.29{\%}, 72.30{\%}, and 61.28{\%} non-targeted attack success rates, with 88.68{\%}, 83.63{\%}, and 73.07{\%} confidence on average, on three common types of CNNs. The attack requires modifying only five pixels, with pixel-value distortions of 20.44, 14.28, and 22.98, respectively. Thus, we show that current deep neural networks are also vulnerable to such simple black-box attacks, even under very limited attack conditions.",
author = "Su, Jiawei and Vargas, {Danilo Vasconcellos} and Sakurai, Kouichi",
year = "2019",
month = "2",
day = "1",
doi = "10.1186/s41074-019-0053-3",
language = "English",
volume = "11",
journal = "IPSJ Transactions on Computer Vision and Applications",
issn = "1882-6695",
publisher = "Information Processing Society of Japan",
number = "1",

}

TY - JOUR

T1 - Attacking convolutional neural network using differential evolution

AU - Su, Jiawei

AU - Vargas, Danilo Vasconcellos

AU - Sakurai, Kouichi

PY - 2019/2/1

Y1 - 2019/2/1

N2 - The output of convolutional neural networks (CNNs) has been shown to be discontinuous, which can make a CNN image classifier vulnerable to small, well-tuned artificial perturbations. That is, images modified by such alterations (i.e., adversarial perturbations), which are barely noticeable to the human eye, can completely change the CNN classification results. In this paper, we propose a practical attack that uses differential evolution (DE) to generate effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DE for conducting the attack on different network structures. The proposed method modifies only five pixels (i.e., a few-pixel attack), and it is a black-box attack that requires only the oracle feedback of the target CNN system. The results show that under strict constraints that simultaneously control the number of pixels changed and the overall perturbation strength, the attack achieves 72.29%, 72.30%, and 61.28% non-targeted attack success rates, with 88.68%, 83.63%, and 73.07% confidence on average, on three common types of CNNs. The attack requires modifying only five pixels, with pixel-value distortions of 20.44, 14.28, and 22.98, respectively. Thus, we show that current deep neural networks are also vulnerable to such simple black-box attacks, even under very limited attack conditions.

AB - The output of convolutional neural networks (CNNs) has been shown to be discontinuous, which can make a CNN image classifier vulnerable to small, well-tuned artificial perturbations. That is, images modified by such alterations (i.e., adversarial perturbations), which are barely noticeable to the human eye, can completely change the CNN classification results. In this paper, we propose a practical attack that uses differential evolution (DE) to generate effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DE for conducting the attack on different network structures. The proposed method modifies only five pixels (i.e., a few-pixel attack), and it is a black-box attack that requires only the oracle feedback of the target CNN system. The results show that under strict constraints that simultaneously control the number of pixels changed and the overall perturbation strength, the attack achieves 72.29%, 72.30%, and 61.28% non-targeted attack success rates, with 88.68%, 83.63%, and 73.07% confidence on average, on three common types of CNNs. The attack requires modifying only five pixels, with pixel-value distortions of 20.44, 14.28, and 22.98, respectively. Thus, we show that current deep neural networks are also vulnerable to such simple black-box attacks, even under very limited attack conditions.

UR - http://www.scopus.com/inward/record.url?scp=85062046664&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062046664&partnerID=8YFLogxK

U2 - 10.1186/s41074-019-0053-3

DO - 10.1186/s41074-019-0053-3

M3 - Article

AN - SCOPUS:85062046664

VL - 11

JO - IPSJ Transactions on Computer Vision and Applications

JF - IPSJ Transactions on Computer Vision and Applications

SN - 1882-6695

IS - 1

M1 - 1

ER -