Abstract
Deep Neural Networks (DNNs) are state-of-the-art algorithms for image classification. Despite significant achievements and promising prospects, deep neural networks and their accompanying learning algorithms still face important challenges. In particular, they are relatively easy to attack and fool with well-designed input samples called adversarial examples, whose perturbations are typically imperceptible to humans. Such attacks pose a severe threat to the deployment of these systems in critical applications, such as medical or military systems. Hence, it is necessary to develop methods for counteracting these attacks. These methods, called defense strategies, aim at increasing the neural model's robustness against adversarial attacks. In this paper, we review recent findings on adversarial attacks and defense strategies. We also analyze the effects of the attacks and of the applied defense strategies using local and global explanation methods from the family of explainable artificial intelligence (XAI).
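To make the notion of an adversarial example concrete, the sketch below illustrates the Fast Gradient Sign Method (FGSM) for a pre-trained Keras image classifier. This is a minimal illustration under our own assumptions, not the paper's exact experimental code; the names `model`, `image`, `label`, and `epsilon` are illustrative placeholders.

```python
# Minimal FGSM sketch: perturb an input along the sign of the loss gradient.
# Assumes `model` is a pre-trained Keras classifier with softmax outputs and
# `image` is a float array scaled to [0, 1] (hypothetical setup).
import tensorflow as tf

def fgsm_attack(model, image, label, epsilon=0.01):
    """Craft an adversarial example bounded by `epsilon` in the L-infinity norm."""
    image = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    label = tf.convert_to_tensor([label])
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = loss_fn(label, prediction)

    # Gradient of the loss with respect to the input pixels
    gradient = tape.gradient(loss, image)
    # Step in the direction that maximally increases the loss
    adversarial = image + epsilon * tf.sign(gradient)
    # Keep pixel values in the valid range
    return tf.clip_by_value(adversarial, 0.0, 1.0)[0]
```

Even with a small `epsilon`, the resulting image is usually indistinguishable from the original to a human observer, yet it can change the classifier's prediction; comparing saliency or relevance maps of the clean and perturbed inputs is one way such attacks can be inspected with XAI methods.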