Translating Numerical Concepts for PDEs into Neural Architectures

  • Conference paper

In: Scale Space and Variational Methods in Computer Vision (SSVM 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12679)

Abstract

We investigate what can be learned from translating numerical algorithms into neural networks. On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D, as well as linear multigrid methods. On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets. These connections guarantee Euclidean stability of specific ResNets with a transposed convolution layer structure in each block. We present three numerical justifications for skip connections: as time discretisations in explicit schemes, as extrapolation mechanisms for accelerating those methods, and as recurrent connections in fixed point solvers for implicit schemes. Last but not least, we also motivate uncommon design choices such as nonmonotone activation functions. Our findings give a numerical perspective on the success of modern neural network architectures, and they provide design criteria for stable networks.
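
To make the first of these correspondences concrete, the following minimal NumPy sketch implements one explicit step of a 1D nonlinear diffusion scheme, u^{k+1} = u^k - tau K^T Phi(K u^k), as a residual block: an inner difference convolution K, a nonmonotone flux function Phi in the role of the activation, a transposed convolution K^T, and a skip connection supplied by the time discretisation. It also sketches the accelerated variant, where a second skip connection to the previous iterate performs the extrapolation. The helper names, the Perona-Malik flux, the grid settings, and the FSI-style weights alpha_k = (4k+2)/(2k+3) are illustrative assumptions on our part, not code from the paper.

```python
import numpy as np

# Minimal sketch (our notation, not the authors' code): one explicit step
# of 1D nonlinear diffusion,  u^{k+1} = u^k - tau * K^T Phi(K u^k),
# read as a residual block. K is a difference "convolution", K^T its
# transposed convolution, Phi a nonmonotone flux in the role of the
# activation, and the time discretisation supplies the skip connection.

def K(u, h=1.0):
    """Inner convolution: forward differences with a reflecting boundary."""
    return np.diff(u, append=u[-1]) / h

def K_T(v, h=1.0):
    """Transposed convolution: negated backward differences; agrees with
    the matrix transpose of K on outputs of K (where v[-1] == 0)."""
    return -np.diff(v, prepend=0.0) / h

def phi(s, lam=1.0):
    """Perona-Malik flux Phi(s) = s / (1 + s^2 / lam^2): a nonmonotone
    'activation' that decreases again for |s| > lam (edge preservation)."""
    return s / (1.0 + (s / lam) ** 2)

def resnet_block(u, tau=0.25, h=1.0, lam=1.0):
    """One explicit diffusion step = one residual block with skip connection.
    Since 0 < phi(s)/s <= 1 and ||K||^2 <= 4/h^2, choosing tau <= h^2/2
    keeps the step nonexpansive in the Euclidean norm."""
    return u - tau * K_T(phi(K(u, h), lam), h)

def fsi_step(u, u_prev, k, tau=0.25):
    """Accelerated explicit scheme: an FSI-style extrapolation (Hafner et
    al., GCPR 2016) adds a second, longer-range skip connection to u^{k-1}.
    The weights alpha_k = (4k+2)/(2k+3) are quoted from that literature."""
    alpha = (4.0 * k + 2.0) / (2.0 * k + 3.0)
    return alpha * resnet_block(u, tau) + (1.0 - alpha) * u_prev

if __name__ == "__main__":
    u = np.sin(np.linspace(0.0, 2.0 * np.pi, 64)) + 0.1 * np.random.randn(64)
    u_prev = u.copy()                      # FSI convention: u^{-1} = u^0
    for k in range(20):
        u, u_prev = fsi_step(u, u_prev, k), u
    print(np.round(u, 3))
```

Note how the block mirrors the transposed convolution structure K^T Phi(K u) mentioned in the abstract: because the Perona-Malik flux satisfies 0 < Phi(s)/s <= 1, a time step tau <= h^2/2 keeps each step nonexpansive in the Euclidean norm, which is the flavour of stability guarantee the paper derives.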

This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 741215, ERC Advanced Grant INCOVID).



Acknowledgements

We thank Matthias Augustin and Michael Ertel for fruitful discussions and feedback on our manuscript.

Author information

Corresponding author

Correspondence to Tobias Alt.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Alt, T., Peter, P., Weickert, J., Schrader, K. (2021). Translating Numerical Concepts for PDEs into Neural Architectures. In: Elmoataz, A., Fadili, J., Quéau, Y., Rabin, J., Simon, L. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2021. Lecture Notes in Computer Science, vol. 12679. Springer, Cham. https://doi.org/10.1007/978-3-030-75549-2_24

  • DOI: https://doi.org/10.1007/978-3-030-75549-2_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75548-5

  • Online ISBN: 978-3-030-75549-2

  • eBook Packages: Computer Science, Computer Science (R0)
