
Achieving Generalizable Robustness of Deep Neural Networks by Stability Training

  • Conference paper
  • Pattern Recognition (DAGM GCPR 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11824)


Abstract

We study the recently introduced stability training as a general-purpose method for increasing the robustness of deep neural networks against input perturbations. In particular, we explore its use as an alternative to data augmentation and validate its performance against a number of distortion types and transformations, including adversarial examples. In our image classification experiments on ImageNet data, stability training performs on a par with, or even outperforms, data augmentation for specific transformations, while consistently offering improved robustness against a broader range of distortion strengths and types unseen during training, considerably weaker hyperparameter dependence, and fewer potentially negative side effects than data augmentation.
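The stability training referred to here was introduced by Zheng et al. (Improving the Robustness of Deep Neural Networks via Stability Training, CVPR 2016): the ordinary task loss is augmented with a term that penalizes differences between the network's outputs on a clean input and on a perturbed copy of it. The chapter's own implementation details are not reproduced on this page, so the snippet below is only a minimal PyTorch-style sketch of such an objective, assuming Gaussian input noise and a KL-divergence stability term; the function name and the hyperparameter values `alpha` and `sigma` are illustrative placeholders, not the authors' settings.

```python
import torch
import torch.nn.functional as F


def stability_loss(model, x, y, alpha=0.01, sigma=0.04):
    """Task loss plus a stability term penalizing output changes under
    small Gaussian input perturbations (illustrative placeholder values)."""
    logits_clean = model(x)                       # forward pass on the clean batch
    task_loss = F.cross_entropy(logits_clean, y)  # standard classification objective

    x_noisy = x + sigma * torch.randn_like(x)     # perturbed copy of the input
    logits_noisy = model(x_noisy)                 # forward pass on the perturbed batch

    # KL divergence between the clean and perturbed output distributions;
    # the clean prediction is treated as a fixed target here (one common variant).
    stability = F.kl_div(
        F.log_softmax(logits_noisy, dim=1),
        F.softmax(logits_clean, dim=1).detach(),
        reduction="batchmean",
    )
    return task_loss + alpha * stability
```

In this formulation, `alpha` trades off task accuracy against output stability, and the Gaussian noise model can be swapped for whichever perturbation family robustness is desired against.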


Notes

  1. A corresponding preprocessing script is available at https://gist.github.com/nstrodt/bd270131160f02564f0165e888976471.


Acknowledgements

This work was supported by the Bundesministerium für Bildung und Forschung (BMBF) through the Berlin Big Data Center under Grant 01IS14013A and the Berlin Center for Machine Learning under Grant 01IS180371.

Author information

Corresponding author

Correspondence to Nils Strodthoff.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 329 KB)


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Laermann, J., Samek, W., Strodthoff, N. (2019). Achieving Generalizable Robustness of Deep Neural Networks by Stability Training. In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham. https://doi.org/10.1007/978-3-030-33676-9_25


  • DOI: https://doi.org/10.1007/978-3-030-33676-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33675-2

  • Online ISBN: 978-3-030-33676-9

  • eBook Packages: Computer Science, Computer Science (R0)
