Towards Interpretable Defense Against Adversarial Attacks via Causal Inference

Abstract

Deep learning-based models are vulnerable to adversarial attacks, and defending against such attacks is essential in sensitive and safety-critical scenarios. However, deep learning methods still lack effective and efficient defense mechanisms, and most existing methods are stopgaps that target specific adversarial samples. The main obstacle is that it remains unclear how adversarial samples fool deep learning models. Their underlying working mechanism has not been well explored, which is the bottleneck of adversarial attack defense. In this paper, we build a causal model to interpret the generation and performance of adversarial samples, adopting self-attention (the transformer) as a powerful tool within this causal model. Compared to existing methods, causality enables us to analyze adversarial samples more naturally and intrinsically. Based on this causal model, we reveal the working mechanism of adversarial samples and provide an instructive analysis. We then propose simple and effective methods for detecting and recognizing adversarial samples according to the revealed working mechanism. The causal insights enable us to detect and recognize adversarial samples without any extra model or training. Extensive experiments demonstrate the effectiveness of the proposed methods, which outperform state-of-the-art defense methods under various adversarial attacks.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2020AAA0140002), the Natural Science Foundation of China (Nos. U1836217, 62076240, 62006225, 61906199, 62071468, 62176025 and U21B200389), and the CAAI-Huawei MindSpore Open Fund.

Author information

Corresponding author

Correspondence to Yun-Long Wang.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Min Ren received the B.Eng. degree in mechanical engineering and automation from National University of Defense Technology, China in 2013. He is currently a Ph.D. candidate with the School of Artificial Intelligence, University of Chinese Academy of Sciences, China, and the Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China.

His research interests include pattern recognition, computer vision and biometrics.

Yun-Long Wang received the Ph.D. degree in pattern recognition and intelligent systems from the Department of Automation, University of Science and Technology of China, China in 2019. He is currently an associate professor with CRIPAC, NLPR, CASIA, China.

His research interests include pattern recognition, machine learning, light-field photography, and biometrics.

Zhao-Feng He received the Ph.D. degree in pattern recognition and intelligent systems from CASIA, China in 2010. He is currently a professor at Beijing University of Posts and Telecommunications (BUPT) and the founder of the Laboratory of Visual Computing and Intelligent System (VCIS), China.

His research interests include biometrics, computer vision, and intelligent systems.

About this article

Cite this article

Ren, M., Wang, YL. & He, ZF. Towards Interpretable Defense Against Adversarial Attacks via Causal Inference. Mach. Intell. Res. 19, 209–226 (2022). https://doi.org/10.1007/s11633-022-1330-7

  • DOI: https://doi.org/10.1007/s11633-022-1330-7

Keywords

  • Adversarial sample
  • adversarial defense
  • causal inference
  • interpretable machine learning
  • transformers