
Evaluation of adversarial attacks sensitivity of classifiers with occluded input data

Original Article · Neural Computing and Applications

Abstract

Deep learning models have enabled transformative applications aimed at reducing costs and improving human quality of life. Nevertheless, recent work testing classifiers' ability to withstand targeted and black-box adversarial attacks has demonstrated that deep learning models in particular are brittle and lack robustness, ultimately leading to a lack of trust. In this context, a question arises: how sensitive are particular regions of a classification model's input space to adversarial perturbations? This paper studies this problem through a Sensitivity-inspired Constrained Evaluation Method (SICEM) that deterministically evaluates how vulnerable a region of the input space is to adversarial perturbations compared to other regions and to the entire input space. Our experiments suggest that SICEM can accurately quantify region vulnerabilities on the MNIST and CIFAR-10 datasets.


Availability of data and material

For our experiments, we use the MNIST and CIFAR-10 datasets, which are publicly available.
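For convenience, the following is a minimal sketch of loading both datasets, assuming TensorFlow/Keras tooling; the loading code actually used in our experiments is in the supplementary material described in Appendix A.

```python
# Minimal sketch (assumed tooling): load the publicly available datasets
# with tf.keras and scale pixel values to [0, 1].
import tensorflow as tf

(x_train_mnist, y_train_mnist), (x_test_mnist, y_test_mnist) = \
    tf.keras.datasets.mnist.load_data()
(x_train_cifar, y_train_cifar), (x_test_cifar, y_test_cifar) = \
    tf.keras.datasets.cifar10.load_data()

x_train_mnist = x_train_mnist.astype("float32") / 255.0
x_test_mnist = x_test_mnist.astype("float32") / 255.0
x_train_cifar = x_train_cifar.astype("float32") / 255.0
x_test_cifar = x_test_cifar.astype("float32") / 255.0
```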

Code availability

Code is made available as explained in Appendix A.


Funding

This material is based upon work supported by the National Science Foundation under Grants CHE-1905043 and CNS-2136961.

Author information


Contributions

K. Sooksatra executed the research. P. Rivas directed the research.

Corresponding author

Correspondence to Pablo Rivas.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Code to reproduce experiments

The code to reproduce all the experiments in this paper can be found in the attached supplementary zip file entitled:

sicem.zip

This .zip file includes the weights of the trained MNIST and CIFAR-10 classifiers; please change the path variable in the code to your own working space to avoid overwriting the pre-trained classifiers. Also, be aware that training the deep convolutional network, calculating the success rate, and running the individual-image agreement section consume a significant amount of time, so please make sure you have sufficient time. All results are already shown in the Python notebooks, so you do not need to re-run the experiments; if you do want to re-run everything, please follow the advice above (Figs. 12, 13).
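As an illustration of the path change mentioned above, the sketch below separates a user working directory from the shipped pre-trained weights; the variable and file names are illustrative assumptions, not the exact names used in sicem.zip.

```python
# Sketch only: keep retrained outputs out of the pre-trained weights folder.
import os

WORK_DIR = "/path/to/your/workspace"    # change this to your own folder
PRETRAINED_DIR = "./pretrained"         # weights shipped with the zip (assumed name)

os.makedirs(WORK_DIR, exist_ok=True)

# Load the shipped weights read-only ...
mnist_weights = os.path.join(PRETRAINED_DIR, "mnist_classifier.h5")
# ... and write any retrained model somewhere else, so nothing is overwritten.
retrained_out = os.path.join(WORK_DIR, "mnist_classifier_retrained.h5")
```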

B Architectures of MNIST and CIFAR-10 classifiers

Fig. 12 Architecture of our MNIST classifier. Note that a dense layer and a convolutional layer are defined in the red rectangle

Fig. 13 Architecture of our CIFAR-10 classifier. Note that a dense layer and a convolutional layer are defined in the red rectangle in Fig. 12
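For readers without access to the figures, the sketch below shows the general conv + dense pattern highlighted in the red rectangle of Fig. 12. It is an illustrative assumption only; the exact layer counts and sizes are those shown in the figures and in the supplementary code.

```python
# Illustrative sketch (assumed sizes): a small convolutional MNIST classifier
# following a conv + dense pattern, not the exact architecture of Fig. 12.
import tensorflow as tf
from tensorflow.keras import layers

def build_mnist_classifier():
    model = tf.keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```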

C Success rate and average required \(\epsilon \)

Tables 3 and 4 show, respectively, the success rates and the average required \(\epsilon \) needed to find adversarial examples when performing the adversarial attacks on 200 images of the MNIST dataset, where t-attack denotes that the adversary attacks with mask \(M_t\) and b-attack denotes that the adversary attacks with mask \(M_b\). Similarly, Tables 5 and 6 show the success rates and average required \(\epsilon \), respectively, after performing the attacks on 200 images of the CIFAR-10 dataset.

Table 3 Success rate of each attack on MNIST
Table 4 Average \(\epsilon \) of successful attacks on MNIST
Table 5 Success rate of each attack on CIFAR-10
Table 6 Average \(\epsilon \) of successful attacks on CIFAR-10
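For concreteness, the two quantities reported in the tables above can be computed from per-image attack outcomes roughly as in the sketch below, under the assumption that each attack returns a success flag and the \(\epsilon \) it required; the function and variable names are illustrative, not those in the supplementary code.

```python
# Sketch: success rate over all attacked images, and average epsilon over
# the successful attacks only (as in Tables 4 and 6).
import numpy as np

def summarize_attack(results):
    """results: list of (success: bool, epsilon: float or None), one per image."""
    successful_eps = [eps for ok, eps in results if ok]
    success_rate = len(successful_eps) / len(results)
    avg_epsilon = float(np.mean(successful_eps)) if successful_eps else float("nan")
    return success_rate, avg_epsilon

# Hypothetical usage for the t-attack on 200 MNIST images:
# t_attack_results = run_attack(classifier, images, mask=M_t)
# rate, avg_eps = summarize_attack(t_attack_results)
```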


Cite this article

Sooksatra, K., Rivas, P. Evaluation of adversarial attacks sensitivity of classifiers with occluded input data. Neural Comput & Applic 34, 17615–17632 (2022). https://doi.org/10.1007/s00521-022-07387-y
