
Towards sustainable adversarial training with successive perturbation generation


  • Research Article
  • Published in: Frontiers of Information Technology & Electronic Engineering

Abstract

Adversarial training with online-generated adversarial examples has achieved promising performance in defending against adversarial attacks and improving the robustness of convolutional neural network (CNN) models. However, most existing adversarial training methods are dedicated to finding strong adversarial examples that force the model to learn the adversarial data distribution, which inevitably imposes a large computational overhead and degrades generalization performance on clean data. In this paper, we show that progressively increasing the adversarial strength of adversarial examples across training epochs can effectively improve model robustness, and that appropriate model shifting can preserve the generalization performance of models at negligible computational cost. To this end, we propose a successive perturbation generation scheme for adversarial training (SPGAT), which progressively strengthens adversarial examples by adding perturbations to the adversarial examples transferred from the previous epoch, and shifts models across epochs to improve the efficiency of adversarial training. The proposed SPGAT is both efficient and effective; e.g., our method takes 900 min of computation time versus 4100 min for standard adversarial training, while boosting performance by more than 7% in adversarial accuracy and more than 3% in clean accuracy. We extensively evaluate SPGAT on various datasets, including small-scale MNIST, middle-scale CIFAR-10, and large-scale CIFAR-100. The experimental results show that our method is more efficient while performing favorably against state-of-the-art methods.
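The successive perturbation generation idea described above can be sketched in a few lines. The following is an illustrative toy example, not the authors' implementation: it uses a fixed linear model with a logistic-style loss, and the function name `spgat_sketch`, the step size `alpha`, and the linearly growing budget schedule for `eps` are all assumptions chosen for clarity. The key point it demonstrates is that each epoch's perturbation is initialized from the previous epoch's (transferred) perturbation and then strengthened under a growing budget, rather than being regenerated from scratch.

```python
import numpy as np

def spgat_sketch(X, y, w, epochs=5, alpha=0.05, eps_max=0.3):
    """Sketch of successive perturbation generation (assumed details).

    X: inputs, shape (n, d); y: labels in {-1, +1}; w: fixed linear model.
    The perturbation buffer `delta` persists across epochs, so adversarial
    strength accumulates progressively instead of restarting each epoch.
    """
    delta = np.zeros_like(X)  # perturbations transferred across epochs
    for epoch in range(epochs):
        # Budget grows with the epoch: progressive strengthening.
        eps = eps_max * (epoch + 1) / epochs
        # Gradient of a logistic-style loss w.r.t. the (perturbed) input.
        margins = (X + delta) @ w * y
        grad = -(y / (1.0 + np.exp(margins)))[:, None] * w[None, :]
        # One FGSM-like ascent step on top of the transferred perturbation,
        # projected back into the current epsilon ball.
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
        # ... a model update on X + delta would go here in full training ...
    return delta
```

In full adversarial training the model would be updated on `X + delta` inside the loop; reusing `delta` is what saves the repeated multi-step attack cost relative to standard adversarial training.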



Data availability

The data that support the findings of this study are available from the first author upon reasonable request.


Author information


Contributions

Wei LIN designed the research and drafted the paper. Wei LIN and Lichuan LIAO revised and finalized the paper.

Corresponding author

Correspondence to Lichuan Liao  (廖丽娟).

Ethics declarations

Both authors declare that they have no conflict of interest.

Additional information

Project supported by the Scientific Research and Development Foundation of Fujian University of Technology, China (No. GYZ220209)


About this article


Cite this article

Lin, W., Liao, L. Towards sustainable adversarial training with successive perturbation generation. Front Inform Technol Electron Eng 25, 527–539 (2024). https://doi.org/10.1631/FITEE.2300474

