
How to Compare Adversarial Robustness of Classifiers from a Global Perspective

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)

Abstract

Adversarial robustness of machine learning models has attracted considerable attention over recent years. Adversarial attacks undermine the reliability of and trust in machine learning models, but the construction of more robust models hinges on a rigorous understanding of adversarial robustness as a property of a given model. Point-wise measures for specific threat models are currently the most popular tool for comparing the robustness of classifiers and are used in most recent publications on adversarial robustness. In this work, we use robustness curves to show that point-wise measures fail to capture important global properties that are essential to reliably compare the robustness of different classifiers. We introduce new ways in which robustness curves can be used to systematically uncover these properties and provide concrete recommendations for researchers and practitioners when assessing and comparing the robustness of trained models. Furthermore, we characterize scale as a way to distinguish small and large perturbations, and relate it to inherent properties of data sets, demonstrating that robustness thresholds must be chosen accordingly. We hope that our work contributes to a shift of focus away from point-wise measures of robustness and towards a discussion of what kind of robustness could and should reasonably be expected. We release code to reproduce all experiments presented in this paper, including a Python module that calculates robustness curves for arbitrary data sets and classifiers, supporting a number of frameworks, including TensorFlow, PyTorch, and JAX.
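Following [12], the robustness curve of a classifier maps each perturbation budget eps to the fraction of test points on which the classifier can be made to err by some perturbation of size at most eps; a point-wise measure reads this curve at a single threshold. As a minimal sketch of the idea (not the released module's API; the function name and the synthetic distributions below are illustrative assumptions), the empirical curve is simply the cumulative distribution of per-point minimal adversarial perturbation sizes:

    # Minimal sketch, not the authors' released module: an empirical
    # robustness curve is the cumulative distribution of the per-point
    # minimal adversarial perturbation sizes (0 for points that are
    # misclassified without any perturbation).
    import numpy as np
    import matplotlib.pyplot as plt

    def robustness_curve(min_perturbation_sizes):
        """Return (eps, error): error[i] is the fraction of test points
        whose minimal adversarial perturbation size is <= eps[i]."""
        eps = np.sort(np.asarray(min_perturbation_sizes))
        error = np.arange(1, eps.size + 1) / eps.size
        return eps, error

    # Synthetic example with two hypothetical classifiers whose curves
    # cross: A is weaker against small perturbations, B against large ones.
    rng = np.random.default_rng(0)
    sizes_a = rng.gamma(shape=1.0, scale=0.10, size=1000)  # "classifier A"
    sizes_b = rng.gamma(shape=4.0, scale=0.04, size=1000)  # "classifier B"
    for name, sizes in (("A", sizes_a), ("B", sizes_b)):
        eps, error = robustness_curve(sizes)
        plt.step(eps, error, where="post", label=f"classifier {name}")
    plt.xlabel("perturbation size eps")
    plt.ylabel("fraction of points successfully attacked")
    plt.legend()
    plt.show()

Because the two synthetic curves cross, a single-threshold comparison can rank the classifiers either way depending on where the threshold is placed, which is exactly the global information that point-wise measures discard.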

N. Risse, C. Göpfert, and J. P. Göpfert contributed equally.


Notes

  1. Single thresholds: [1, 4, 10, 26, 29, 31, 32, 33, 35, 36, 41, 42]; multiple thresholds: [3, 16, 20, 24, 38]; full analysis: [7, 21, 25, 27, 28].

  2. The full code is available at www.github.com/niklasrisse/how-to-compare-adversarial-robustness-of-classifiers-from-a-global-perspective.

  3. The models trained with ST, KW, AT, and MMR + AT are available at www.github.com/max-andr/provable-robustness-max-linear-regions.

  4. The models trained with MMR-UNIV are available at www.github.com/fra31/mmr-universal.

References

  1. Alayrac, J.-B., Uesato, J., Huang, P.-S., Fawzi, A., Stanforth, R., Kohli, P.: Are labels required for improving adversarial robustness? In: NeurIPS (2019)

  2. Anguita, D., Ghio, A., Oneto, L., Parra, X., Reyes-Ortiz, J.: A public domain dataset for human activity recognition using smartphones. In: ESANN (2013)

  3. Boopathy, A., et al.: Proper network interpretability helps adversarial robustness in classification. In: ICML (2020)

  4. Brendel, W., Rauber, J., Kümmerer, M., Ustyuzhaninov, I., Bethge, M.: Accurate, reliable and fast robustness evaluation. In: NeurIPS (2019)

  5. Carlini, N., et al.: On evaluating adversarial robustness (2019). arXiv:1902.06705

  6. Carlini, N., Wagner, D.A.: Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy (SP) (2017)

  7. Carmon, Y., Raghunathan, A., Schmidt, L., Liang, P., Duchi, J.C.: Unlabeled data improves adversarial robustness. In: NeurIPS (2019)

  8. Cohen, J., Rosenfeld, E., Kolter, Z.: Certified adversarial robustness via randomized smoothing. In: ICML (2019)

  9. Croce, F., Andriushchenko, M., Hein, M.: Provable robustness of ReLU networks via maximization of linear regions. In: AISTATS (2019)

  10. Croce, F., Hein, M.: Provable robustness against all adversarial l_p-perturbations for p ≥ 1. In: ICLR (2020)

  11. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: ICLR (2015)

  12. Göpfert, C., Göpfert, J.P., Hammer, B.: Adversarial robustness curves. In: Machine Learning and Knowledge Discovery in Databases (2020)

  13. Göpfert, J.P., Artelt, A., Wersing, H., Hammer, B.: Adversarial attacks hidden in plain sight. In: Symposium on Intelligent Data Analysis (2020)

  14. Guo, C., Rana, M., Cisse, M., van der Maaten, L.: Countering adversarial images using input transformations (2017). arXiv:1711.00117

  15. Hein, M., Andriushchenko, M.: Formal guarantees on the robustness of a classifier against adversarial manipulation (2017). arXiv:1705.08475

  16. Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.: Using self-supervised learning can improve model robustness and uncertainty. In: NeurIPS (2019)

  17. Houben, S., Stallkamp, J., Salmen, J., Schlipsing, M., Igel, C.: Detection of traffic signs in real-world images: the German Traffic Sign Detection Benchmark. In: IJCNN (2013)

  18. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report (2009)

  19. Lecuyer, M., Atlidakis, V., Geambasu, R., Hsu, D., Jana, S.: Certified robustness to adversarial examples with differential privacy. In: 2019 IEEE Symposium on Security and Privacy (SP) (2019)

  20. Lee, G.-H., Yuan, Y., Chang, S., Jaakkola, T.: Tight certificates of adversarial robustness for randomly smoothed classifiers. In: NeurIPS (2019)

  21. Li, B., Chen, C., Wang, W., Carin, L.: Certified adversarial robustness with additive noise. In: NeurIPS (2019)

  22. Li, F.-F., Karpathy, A., Johnson, J.: CS231n: convolutional neural networks for visual recognition (2016). http://cs231n.stanford.edu/2016/project.html. Accessed 28 Mar 2020

  23. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: ICLR (2018)

  24. Mahloujifar, S., Zhang, X., Mahmoody, M., Evans, D.: Empirically measuring concentration: fundamental limits on intrinsic robustness. In: NeurIPS (2019)

  25. Maini, P., Wong, E., Kolter, Z.: Adversarial robustness against the union of multiple threat models. In: ICML (2020)

  26. Mao, C., Zhong, Z., Yang, J., Vondrick, C., Ray, B.: Metric learning for adversarial robustness. In: NeurIPS (2019)

  27. Najafi, A., Maeda, S.-I., Koyama, M., Miyato, T.: Robustness to adversarial perturbations in learning from incomplete data. In: NeurIPS (2019)

  28. Pinot, R., et al.: Theoretical evidence for adversarial robustness through randomization. In: NeurIPS (2019)

  29. Qin, C., et al.: Adversarial robustness through local linearization. In: NeurIPS (2019)

  30. Rauber, J., Brendel, W., Bethge, M.: Foolbox: a Python toolbox to benchmark the robustness of machine learning models (2017). arXiv:1707.04131

  31. Rice, L., Wong, E., Kolter, Z.: Overfitting in adversarially robust deep learning. In: ICML (2020)

  32. Singla, S., Feizi, S.: Second-order provable defenses against adversarial attacks. In: ICML (2020)

  33. Song, C., He, K., Lin, J., Wang, L., Hopcroft, J.E.: Robust local features for improving the generalization of adversarial training. In: ICLR (2020)

  34. Szegedy, C., et al.: Intriguing properties of neural networks (2014). arXiv:1312.6199

  35. Tramèr, F., Boneh, D.: Adversarial training and robustness for multiple perturbations. In: NeurIPS (2019)

  36. Wang, Y., Zou, D., Yi, J., Bailey, J., Ma, X., Gu, Q.: Improving adversarial robustness requires revisiting misclassified examples. In: ICLR (2020)

  37. Wong, E., Kolter, Z.: Provable defenses against adversarial examples via the convex outer adversarial polytope. In: ICML (2018)

  38. Wong, E., Rice, L., Kolter, J.Z.: Fast is better than free: revisiting adversarial training. In: ICLR (2020)

  39. Wong, E., Schmidt, F.R., Kolter, J.Z.: Wasserstein adversarial examples via projected Sinkhorn iterations. In: ICML (2019)

  40. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms (2017). arXiv:1708.07747

  41. Xie, C., Yuille, A.: Intriguing properties of adversarial training at scale. In: ICLR (2020)

  42. Zhang, J., et al.: Attacks which do not kill training make adversarial learning stronger. In: ICML (2020)


Author information


Corresponding author

Correspondence to Christina Göpfert.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Risse, N., Göpfert, C., Göpfert, J.P. (2021). How to Compare Adversarial Robustness of Classifiers from a Global Perspective. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol 12891. Springer, Cham. https://doi.org/10.1007/978-3-030-86362-3_3


  • DOI: https://doi.org/10.1007/978-3-030-86362-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86361-6

  • Online ISBN: 978-3-030-86362-3

