
Robust Neural Networks Inspired by Strong Stability Preserving Runge-Kutta Methods

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12354)

Abstract

Deep neural networks have achieved state-of-the-art performance in a variety of fields. Recent works observe that a class of widely used neural networks can be viewed as the Euler method of numerical discretization. From this perspective, Strong Stability Preserving (SSP) methods are more advanced than the explicit Euler method and produce solutions that are both accurate and stable. Motivated by the SSP property and a generalized Runge-Kutta method, we propose Strong Stability Preserving networks (SSP networks), which improve robustness against adversarial attacks. We empirically demonstrate that the proposed networks improve robustness against adversarial examples without any defensive method. Further, the SSP networks are complementary to a state-of-the-art adversarial training scheme. Lastly, our experiments show that SSP networks suppress the blow-up of adversarial perturbations. Our results open up a way to study robust neural network architectures by leveraging the rich knowledge in the numerical discretization literature.
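The correspondence between residual blocks and numerical discretization can be illustrated with a minimal NumPy sketch. It is not the authors' architecture. It assumes a hypothetical residual branch f(x) = tanh(Wx) standing in for the learned layers, a step size h, and the standard two-stage SSP Runge-Kutta scheme in Shu-Osher form. The function names `residual_branch`, `euler_block`, and `ssp_rk2_block` are illustrative only.

```python
import numpy as np


def residual_branch(x, W):
    """Hypothetical residual branch f(x) = tanh(W x); a stand-in for learned layers."""
    return np.tanh(W @ x)


def euler_block(x, W, h=1.0):
    """ResNet-style block read as one explicit Euler step: x + h * f(x)."""
    return x + h * residual_branch(x, W)


def ssp_rk2_block(x, W, h=1.0):
    """Two-stage SSP Runge-Kutta step, written as a convex combination of Euler steps."""
    x1 = euler_block(x, W, h)   # first Euler stage
    x2 = euler_block(x1, W, h)  # second Euler stage, applied to the first stage
    return 0.5 * x + 0.5 * x2   # average of the input and the two-stage result


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = 0.1 * rng.standard_normal((4, 4))  # assumed small random weights
    x = rng.standard_normal(4)
    print("Euler block:  ", euler_block(x, W))
    print("SSP-RK2 block:", ssp_rk2_block(x, W))
```

Because the SSP step is a convex combination of Euler steps, any norm bound that holds for a single Euler step carries over to the combined step, which is the property the SSP literature builds on. In this sketch both stages share the same weights W; whether and how weights are shared in the actual SSP networks is not stated in the abstract.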

Notes

Acknowledgements

This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00533), the National Supercomputing Center with supercomputing resources including technical support (KSC-2019-CRE-0186), the National Research Foundation of Korea (NRF-2020R1A2C3010638), and Simons Foundation Collaboration Grants for Mathematicians.

Supplementary material

Supplementary material 1: 504446_1_En_24_MOESM1_ESM.pdf (PDF, 797 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, Korea University, Seoul, Republic of Korea
  2. Department of Mathematics and Statistics, San Diego State University, San Diego, USA
