Toward Adversarial Robustness by Diversity in an Ensemble of Specialized Deep Neural Networks

Abbasi, Mahdieh; Rajabi, Arezoo; Gagné, Christian; Bobba, Rakesh B.

doi:10.1007/978-3-030-47358-7_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12109)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

We aim to demonstrate how diversity in an ensemble of CNNs helps detect black-box adversarial instances and makes white-box adversarial attacks harder to generate. To this end, we propose an ensemble of diverse specialized CNNs along with a simple voting mechanism. The diversity in this ensemble creates a gap between the predictive confidences of adversaries and those of clean samples, making adversaries detectable. We then analyze how diversity in such an ensemble of specialists may mitigate the risk of black-box and white-box adversarial examples. Using MNIST and CIFAR-10, we empirically verify that our ensemble detects a large portion of well-known black-box adversarial examples, which significantly reduces the risk rate of adversaries at the expense of a small increase in the risk rate of clean samples. Moreover, we show that the success rate of generating white-box attacks against our ensemble is markedly lower than against a vanilla CNN or an ensemble of vanilla CNNs, highlighting the beneficial role of diversity in the ensemble for developing more robust models.
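
The abstract does not spell out the voting mechanism, so the following is only a minimal sketch of the general idea of confidence-based detection, not the authors' method. It assumes a plain average of the members' class probabilities and a fixed rejection cutoff; the function name `ensemble_predict`, the `threshold` value, and the toy probability vectors are all hypothetical.

```python
from typing import List, Optional, Tuple


def ensemble_predict(
    member_probs: List[List[float]], threshold: float = 0.5
) -> Tuple[Optional[int], float]:
    """Average the members' class probabilities and reject low-confidence inputs.

    member_probs: one probability vector per ensemble member (hypothetical input).
    threshold: illustrative confidence cutoff, not a value taken from the paper.
    Returns (predicted class, or None if rejected; ensemble confidence).
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    num_classes = len(member_probs[0])
    # Plain average of the members' predictive distributions (an assumption).
    avg = [
        sum(p[c] for p in member_probs) / len(member_probs)
        for c in range(num_classes)
    ]
    confidence = max(avg)
    label = avg.index(confidence)
    # Members that disagree yield a flat average, hence a low confidence.
    if confidence < threshold:
        return None, confidence
    return label, confidence


# Toy case where the members agree: the input is accepted.
agree = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.85, 0.1, 0.05]]
# Toy case where the members disagree: the input is rejected.
disagree = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]

print(ensemble_predict(agree))
print(ensemble_predict(disagree))
```

In this toy setting, disagreement among members flattens the averaged distribution, which is one way a confidence gap between agreed-upon and contested inputs could be turned into a rejection rule.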

Notes

1. For convenience, \(\mathcal{W}\) is dropped from \(h_{\mathcal{W}}(\cdot)\).

Acknowledgements

This work was funded by NSERC-Canada, Mitacs, and Prompt-Québec. We thank Annette Schwerdtfeger for proofreading the paper.

Author information

Authors and Affiliations

IID, Université Laval, Québec, Canada
Mahdieh Abbasi & Christian Gagné
Oregon State University, Corvallis, USA
Arezoo Rajabi & Rakesh B. Bobba
Mila, Canada CIFAR AI Chair, Quebec City, Canada
Christian Gagné

Corresponding author

Correspondence to Mahdieh Abbasi.

Editor information

Editors and Affiliations

National Research Council Canada, Ottawa, ON, Canada
Cyril Goutte
Queen’s University, Kingston, ON, Canada
Xiaodan Zhu

About this paper

Cite this paper

Abbasi, M., Rajabi, A., Gagné, C., Bobba, R.B. (2020). Toward Adversarial Robustness by Diversity in an Ensemble of Specialized Deep Neural Networks. In: Goutte, C., Zhu, X. (eds) Advances in Artificial Intelligence. Canadian AI 2020. Lecture Notes in Computer Science, vol 12109. Springer, Cham. https://doi.org/10.1007/978-3-030-47358-7_1

DOI: https://doi.org/10.1007/978-3-030-47358-7_1
Published: 06 May 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47357-0
Online ISBN: 978-3-030-47358-7
eBook Packages: Computer Science, Computer Science (R0)