Translating Numerical Concepts for PDEs into Neural Architectures

  • Conference paper

In: Scale Space and Variational Methods in Computer Vision (SSVM 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12679)

Abstract

We investigate what can be learned from translating numerical algorithms into neural networks. On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D, as well as linear multigrid methods. On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets. These connections guarantee Euclidean stability of specific ResNets with a transposed convolution layer structure in each block. We present three numerical justifications for skip connections: as time discretisations in explicit schemes, as extrapolation mechanisms for accelerating those methods, and as recurrent connections in fixed point solvers for implicit schemes. Last but not least, we also motivate uncommon design choices such as nonmonotone activation functions. Our findings give a numerical perspective on the success of modern neural network architectures, and they provide design criteria for stable networks.
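
To make the first of these correspondences concrete, the following minimal NumPy sketch implements one explicit step of a 1D nonlinear diffusion scheme, u^{k+1} = u^k - tau K^T Phi(K u^k), as a residual block: an inner difference convolution K, a nonmonotone flux function Phi in the role of the activation, a transposed convolution K^T, and a skip connection supplied by the time discretisation. It also sketches the accelerated variant, where a second skip connection to the previous iterate performs the extrapolation. The helper names, the Perona-Malik flux, the grid settings, and the FSI-style weights alpha_k = (4k+2)/(2k+3) are illustrative assumptions on our part, not code from the paper.

```python
import numpy as np

# Minimal sketch (our notation, not the authors' code): one explicit step
# of 1D nonlinear diffusion,  u^{k+1} = u^k - tau * K^T Phi(K u^k),
# read as a residual block. K is a difference "convolution", K^T its
# transposed convolution, Phi a nonmonotone flux in the role of the
# activation, and the time discretisation supplies the skip connection.

def K(u, h=1.0):
    """Inner convolution: forward differences with a reflecting boundary."""
    return np.diff(u, append=u[-1]) / h

def K_T(v, h=1.0):
    """Transposed convolution: negated backward differences; agrees with
    the matrix transpose of K on outputs of K (where v[-1] == 0)."""
    return -np.diff(v, prepend=0.0) / h

def phi(s, lam=1.0):
    """Perona-Malik flux Phi(s) = s / (1 + s^2 / lam^2): a nonmonotone
    'activation' that decreases again for |s| > lam (edge preservation)."""
    return s / (1.0 + (s / lam) ** 2)

def resnet_block(u, tau=0.25, h=1.0, lam=1.0):
    """One explicit diffusion step = one residual block with skip connection.
    Since 0 < phi(s)/s <= 1 and ||K||^2 <= 4/h^2, choosing tau <= h^2/2
    keeps the step nonexpansive in the Euclidean norm."""
    return u - tau * K_T(phi(K(u, h), lam), h)

def fsi_step(u, u_prev, k, tau=0.25):
    """Accelerated explicit scheme: an FSI-style extrapolation (Hafner et
    al., GCPR 2016) adds a second, longer-range skip connection to u^{k-1}.
    The weights alpha_k = (4k+2)/(2k+3) are quoted from that literature."""
    alpha = (4.0 * k + 2.0) / (2.0 * k + 3.0)
    return alpha * resnet_block(u, tau) + (1.0 - alpha) * u_prev

if __name__ == "__main__":
    u = np.sin(np.linspace(0.0, 2.0 * np.pi, 64)) + 0.1 * np.random.randn(64)
    u_prev = u.copy()                      # FSI convention: u^{-1} = u^0
    for k in range(20):
        u, u_prev = fsi_step(u, u_prev, k), u
    print(np.round(u, 3))
```

Note how the block mirrors the transposed convolution structure K^T Phi(K u) mentioned in the abstract: because the Perona-Malik flux satisfies 0 < Phi(s)/s <= 1, a time step tau <= h^2/2 keeps each step nonexpansive in the Euclidean norm, which is the flavour of stability guarantee the paper derives.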

This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 741215, ERC Advanced Grant INCOVID).



Acknowledgements

We thank Matthias Augustin and Michael Ertel for fruitful discussions and feedback on our manuscript.

Author information

Corresponding author

Correspondence to Tobias Alt.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Alt, T., Peter, P., Weickert, J., Schrader, K. (2021). Translating Numerical Concepts for PDEs into Neural Architectures. In: Elmoataz, A., Fadili, J., Quéau, Y., Rabin, J., Simon, L. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2021. Lecture Notes in Computer Science, vol. 12679. Springer, Cham. https://doi.org/10.1007/978-3-030-75549-2_24

  • DOI: https://doi.org/10.1007/978-3-030-75549-2_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75548-5

  • Online ISBN: 978-3-030-75549-2

  • eBook Packages: Computer Science, Computer Science (R0)
