Abstract
The noteworthy achievements of deep learning models have enabled transformative applications aimed at reducing costs and improving quality of life. Nevertheless, recent work on testing classifiers against targeted and black-box adversarial attacks has shown that deep learning models in particular are brittle and lack robustness, which ultimately undermines trust in them. In this context, a natural question is how sensitive certain regions of a classification model's input space are to adversarial perturbations. This paper studies this problem through a Sensitivity-inspired Constrained Evaluation Method (SICEM) that deterministically evaluates how vulnerable a region of the input space is to adversarial perturbations relative to other regions and to the input space as a whole. Our experiments suggest that SICEM accurately quantifies region vulnerabilities on the MNIST and CIFAR-10 datasets.
Availability of data and material
For our experiments, we use the MNIST and CIFAR-10 datasets, which are publicly available.
Code availability
Code is made available as explained in Appendix A.
Funding
This material is based upon work supported by the National Science Foundation under Grants CHE-1905043 and CNS-2136961.
Author information
Contributions
K. Sooksatra executed the research. P. Rivas directed the research.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
A Code to reproduce experiments
The code to reproduce all the experiments in this paper can be found in the attached supplementary zip file entitled:
sicem.zip
This .zip file includes the weights of the trained MNIST and CIFAR-10 classifiers; please change the path variable in the code to your own workspace to avoid overwriting the pre-trained classifiers, as illustrated in the sketch below. Also, be aware that training the deep convolutional network, calculating the success rate, and running the individual-image agreement section consume a significant amount of time, so please make sure you have sufficient time. However, all the results are already shown in the Python notebooks, and you do not need to re-run all the experiments. Nonetheless, if you do want to re-run everything, please follow the advice above (Figs. 12, 13).
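The snippet below is a hypothetical illustration of the path adjustment described above; the variable name, directory layout, and checkpoint file names are assumptions for illustration and are not the actual identifiers used in sicem.zip.

```python
# Hypothetical sketch only: the variable and file names below are
# illustrative assumptions, not the actual names in the supplementary code.
import os

PATH = "/path/to/your/workspace"  # point this at your own working directory

# Keep the pre-trained checkpoints shipped with the archive separate from
# any newly trained ones so the originals are never overwritten.
pretrained_dir = os.path.join(PATH, "pretrained")   # weights from sicem.zip
retrained_dir = os.path.join(PATH, "retrained")     # weights you train yourself
os.makedirs(retrained_dir, exist_ok=True)

mnist_weights = os.path.join(pretrained_dir, "mnist_classifier.h5")
cifar_weights = os.path.join(pretrained_dir, "cifar10_classifier.h5")
```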
B Architectures of MNIST and CIFAR-10 classifiers
C Success rate and average required \(\epsilon \)
Tables 3 and 4 show the success rates and the average required \(\epsilon \), respectively, for finding adversarial examples after performing the adversarial attacks on 200 images of the MNIST dataset, where t-attack denotes that the adversary attacks with mask \(M_t\) and b-attack denotes that the adversary attacks with mask \(M_b\). Similarly, Tables 5 and 6 show the success rates and the average required \(\epsilon \), respectively, after performing the attacks on 200 images of the CIFAR-10 dataset. A sketch of how such mask-constrained attacks and both metrics could be computed is given below.
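The following is a minimal sketch of one way a mask-constrained attack and the two reported metrics (success rate and average required \(\epsilon \)) could be computed; it is not the authors' implementation. The toy linear model, the gradient surrogate, the step size, the \(\epsilon \) budget, and the random stand-in for the mask \(M_t\) are all illustrative assumptions.

```python
# Toy sketch (not the paper's code): a mask-constrained attack that searches
# for the smallest epsilon flipping the prediction, plus the two metrics.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 784))            # toy linear classifier: logits = W @ x

def predict(x):
    return np.argmax(W @ x)

def masked_attack(x, y, mask, step=0.01, eps_max=1.0):
    """Grow epsilon until the predicted class flips, perturbing only where mask == 1."""
    grad_sign = np.sign(W[y])             # crude surrogate for the loss gradient w.r.t. x
    eps = step
    while eps <= eps_max:
        x_adv = np.clip(x - eps * grad_sign * mask, 0.0, 1.0)
        if predict(x_adv) != y:
            return True, eps              # attack succeeded at this epsilon
        eps += step
    return False, None                    # no adversarial example within the budget

# Success rate and average required epsilon over a batch of images.
images = rng.uniform(size=(200, 784))
labels = np.array([predict(x) for x in images])   # attack the predicted labels
mask_t = rng.integers(0, 2, size=784)             # random stand-in for the mask M_t
results = [masked_attack(x, y, mask_t) for x, y in zip(images, labels)]
required_eps = [eps for ok, eps in results if ok]
print("success rate:", len(required_eps) / len(results))
print("average required epsilon:", np.mean(required_eps) if required_eps else float("nan"))
```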
About this article
Cite this article
Sooksatra, K., Rivas, P. Evaluation of adversarial attacks sensitivity of classifiers with occluded input data. Neural Comput & Applic 34, 17615–17632 (2022). https://doi.org/10.1007/s00521-022-07387-y