Abstract
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, in which imperceptible perturbations added to an input cause incorrect predictions. Adversarial attacks fall into two categories: image-dependent attacks, which craft a unique perturbation for each clean example, and image-agnostic attacks, which craft a single universal adversarial perturbation (UAP) that fools the target model on all clean examples. However, existing UAP methods rely only on the output of the target DNN, so the perturbation exploits little of the network's feature extraction process. In this paper, we consider the difference between the mid-level features of a clean example and those of its adversarial counterpart at different intermediate layers of the target DNN. Specifically, we maximize the impact of the adversarial example during forward propagation by pulling apart the feature representations of the clean and adversarial examples. Moreover, to support both targeted and non-targeted attacks, we design a loss function that emphasizes the UAP's feature representation and thereby guides the direction of the perturbation in the feature layers. Furthermore, to reduce training time and the number of training parameters, we adopt a direct optimization approach to craft UAPs and demonstrate experimentally that it achieves a higher fooling rate with fewer examples. Extensive experimental results show that our approach outperforms state-of-the-art methods in both non-targeted and targeted universal attacks.
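To make the direct-optimization idea concrete, below is a minimal PyTorch sketch, not the authors' released code: it optimizes a single perturbation so that the mid-level features of perturbed inputs are pushed away from those of the clean inputs, under an l-infinity budget. All names here (extract_features, craft_uap, the choice of layers, and the MSE feature distance) are illustrative assumptions.

```python
import itertools
import torch
import torch.nn as nn

def extract_features(model, layers, x):
    """Collect activations of the chosen intermediate layers for input x."""
    feats = []
    hooks = [l.register_forward_hook(lambda m, i, o: feats.append(o))
             for l in layers]
    model(x)
    for h in hooks:
        h.remove()
    return feats

def craft_uap(model, layers, loader, epsilon=10 / 255, steps=1000, lr=0.01):
    """Directly optimize one universal perturbation (a hypothetical sketch)."""
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)  # only the UAP is trained
    delta = torch.zeros(1, 3, 224, 224, requires_grad=True)  # the UAP
    opt = torch.optim.Adam([delta], lr=lr)
    data = itertools.cycle(loader)  # reuse the (few) training examples
    for _ in range(steps):
        x, _ = next(data)
        clean_feats = [f.detach() for f in extract_features(model, layers, x)]
        adv_feats = extract_features(model, layers, x + delta)
        # Non-targeted objective: push the adversarial mid-level features
        # away from the clean ones at every chosen layer.
        loss = sum(-nn.functional.mse_loss(a, c)
                   for a, c in zip(adv_feats, clean_feats))
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the perturbation imperceptible (l_inf ball of radius epsilon).
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)
    return delta.detach()
```

A targeted variant of this sketch would add a term pulling the adversarial features toward those of a chosen target class, in the spirit of the loss design described in the abstract.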
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grants 62162067 and 62101480, in part by the Fund Project of the Yunnan Province Education Department under Grant No. 2022j0008, and in part by the Yunnan Province Science Foundation under Grants No. 202005AC160007 and No. 202001BB050076 (Research and Application of Object Detection Based on Artificial Intelligence).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Wu, H., Li, H., Zhang, J., Zhou, W., Guo, L., Dong, Y. (2023). Multi-scale Features Destructive Universal Adversarial Perturbations. In: Wang, D., Yung, M., Liu, Z., Chen, X. (eds) Information and Communications Security. ICICS 2023. Lecture Notes in Computer Science, vol 14252. Springer, Singapore. https://doi.org/10.1007/978-981-99-7356-9_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7355-2
Online ISBN: 978-981-99-7356-9