Abstract
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, in which imperceptible perturbations added to an input cause incorrect predictions. Adversarial attacks fall into two categories: image-dependent attacks, which craft a unique perturbation for each clean example, and image-agnostic attacks, which craft a single universal adversarial perturbation (UAP) that fools the target model on all clean examples. However, existing UAP methods rely only on the output of the target DNN, so the perturbation exploits little of the network's feature extraction process. In this paper, we consider the difference between the mid-level features of a clean example and those of its adversarial counterpart at different intermediate layers of the target DNN. Specifically, we maximize the impact of the adversarial example during forward propagation by pulling apart the feature representations of the clean and adversarial examples. Moreover, to support both targeted and non-targeted attacks, we design a loss function that emphasizes the UAP's feature representation and thereby guides the direction of the perturbation in the feature layers. Furthermore, to reduce training time and the number of training parameters, we adopt a direct optimization approach to craft UAPs and demonstrate experimentally that it achieves a higher fooling rate with fewer examples. Extensive experimental results show that our approach outperforms state-of-the-art methods in both non-targeted and targeted universal attacks.
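To make the direct-optimization idea concrete, below is a minimal PyTorch sketch, not the authors' released code: it optimizes a single perturbation so that the mid-level features of perturbed inputs are pushed away from those of the clean inputs, under an l-infinity budget. All names here (extract_features, craft_uap, the choice of layers, and the MSE feature distance) are illustrative assumptions.

```python
import itertools
import torch
import torch.nn as nn

def extract_features(model, layers, x):
    """Collect activations of the chosen intermediate layers for input x."""
    feats = []
    hooks = [l.register_forward_hook(lambda m, i, o: feats.append(o))
             for l in layers]
    model(x)
    for h in hooks:
        h.remove()
    return feats

def craft_uap(model, layers, loader, epsilon=10 / 255, steps=1000, lr=0.01):
    """Directly optimize one universal perturbation (a hypothetical sketch)."""
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)  # only the UAP is trained
    delta = torch.zeros(1, 3, 224, 224, requires_grad=True)  # the UAP
    opt = torch.optim.Adam([delta], lr=lr)
    data = itertools.cycle(loader)  # reuse the (few) training examples
    for _ in range(steps):
        x, _ = next(data)
        clean_feats = [f.detach() for f in extract_features(model, layers, x)]
        adv_feats = extract_features(model, layers, x + delta)
        # Non-targeted objective: push the adversarial mid-level features
        # away from the clean ones at every chosen layer.
        loss = sum(-nn.functional.mse_loss(a, c)
                   for a, c in zip(adv_feats, clean_feats))
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the perturbation imperceptible (l_inf ball of radius epsilon).
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)
    return delta.detach()
```

A targeted variant of this sketch would add a term pulling the adversarial features toward those of a chosen target class, in the spirit of the loss design described in the abstract.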
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grants 62162067 and 62101480, in part by the Fund Project of the Yunnan Province Education Department under Grant No. 2022j0008, and in part by the Yunnan Province Science Foundation under Grants No. 202005AC160007 and No. 202001BB050076 (Research and Application of Object Detection Based on Artificial Intelligence).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Wu, H., Li, H., Zhang, J., Zhou, W., Guo, L., Dong, Y. (2023). Multi-scale Features Destructive Universal Adversarial Perturbations. In: Wang, D., Yung, M., Liu, Z., Chen, X. (eds) Information and Communications Security. ICICS 2023. Lecture Notes in Computer Science, vol 14252. Springer, Singapore. https://doi.org/10.1007/978-981-99-7356-9_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7355-2
Online ISBN: 978-981-99-7356-9