
Adversarial example generation for object detection using a data augmentation framework and momentum

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

Object detection plays a vital role in numerous fields, including unmanned vehicles, security, and industry. However, recent studies have shown that deep-learning-based object detection models are vulnerable to adversarial example attacks. This poses significant challenges to model robustness and substantially limits the applicability of object detection in security-critical scenarios. Many existing adversarial attack methods for object detection target specific types of detection models, and weak transferability across different models remains a widespread issue; there is also considerable room to improve attack success rates. To tackle these challenges, this paper proposes a data augmentation framework that leverages the similarity between adversarial example generation and neural network training. Instead of using the current gradient directly at each iteration, the proposed method employs a weighted average gradient obtained by randomly combining multiple data augmentation methods. It then combines the augmented gradients with momentum to stabilize the update direction and avoid overfitting to the white-box model. The performance of the proposed method in attacking object detection models is evaluated on the MS COCO dataset using Faster R-CNN, YOLOv3, and YOLOv8. The experimental results demonstrate that the proposed method outperforms Projected Gradient Descent (PGD) and the Iterative Fast Gradient Sign Method (I-FGSM) in both white-box and black-box settings, achieving a transfer success rate of up to 81.9% on RetinaNet.
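The iterative scheme described in the abstract, averaging gradients over randomly chosen augmentations and accumulating them with momentum before a signed, projected update, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names are assumptions, and a toy quadratic loss stands in for a detector's loss, which in practice would be differentiated through the detection model.

```python
import numpy as np

def momentum_aug_attack(x0, grad_fn, augment_fns, eps=0.3, alpha=0.03,
                        mu=1.0, steps=10, rng=None):
    """Iterative sign attack: average the loss gradient over a random subset
    of augmentations, accumulate with momentum (MI-FGSM style), and project
    the perturbation into an L-infinity ball of radius eps around x0."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    g = np.zeros_like(x0)  # momentum accumulator
    for _ in range(steps):
        # weighted average gradient over randomly selected augmentations
        k = min(2, len(augment_fns))
        idx = rng.choice(len(augment_fns), size=k, replace=False)
        avg = np.mean([grad_fn(augment_fns[i](x)) for i in idx], axis=0)
        # momentum update: normalize the averaged gradient by its L1 norm
        g = mu * g + avg / (np.abs(avg).sum() + 1e-12)
        x = x + alpha * np.sign(g)          # signed ascent step on the loss
        x = np.clip(x, x0 - eps, x0 + eps)  # project into the eps-ball
    return x

# Toy demo: maximize a quadratic loss ||x - t||^2 via its gradient 2(x - t).
x0, t = np.zeros(4), np.ones(4)
augs = [lambda x: x, lambda x: 0.9 * x]     # identity and a scaling augmentation
x_adv = momentum_aug_attack(x0, lambda x: 2.0 * (x - t), augs)
```

The L1 normalization before the momentum update follows the convention of momentum-based iterative attacks, so that the accumulated direction is not dominated by gradient magnitude.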


Data availability

The data that support the findings of this study are available on request from the corresponding author.


Acknowledgements

The authors thank the peer reviewers for their careful reading and constructive comments.

Author information


Contributions

D. contributed to the writing of the main manuscript text, and S. provided insightful analysis and interpretation of the research findings. M.D.X. played a key role in preparing Figs. 1–5, ensuring their accuracy and visual clarity. All authors actively participated in the review process, offering valuable feedback and suggestions to enhance the overall quality of the manuscript.

Corresponding author

Correspondence to Zhiyi Ding.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ding, Z., Sun, L., Mao, X. et al. Adversarial example generation for object detection using a data augmentation framework and momentum. SIViP 18, 2485–2497 (2024). https://doi.org/10.1007/s11760-023-02924-1
