
Mathematical analysis of finite parameter deep neural network models with skip connections from the viewpoint of representation sets

  • Original Paper
  • Published in Japan Journal of Industrial and Applied Mathematics

Abstract

In this paper, we discuss representation capability toward a systematic understanding of connection structures in deep neural networks. Deep learning is a machine learning method based on deep neural networks, and a wide variety of network structures have been proposed. Skip connections, one of the structures introduced in the ResNet model, have since become a standard architectural component. Although skip connections are a straightforward structure whose effectiveness has been demonstrated empirically, the connection structures of deep neural network models are not well understood mathematically, and proposals for model structures are not made systematically. In our approach, we study the problem from the perspective of the sets of functions represented by neural network models with finitely many parameters, and we clarify the correspondence between models with different connection structures. We show that a variety of branching connection structures, such as those represented by skip connections, can be realized by a multilayer ReLU perceptron model, i.e., a model containing only series connections.
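
As a rough illustration of this correspondence (a minimal sketch, not the construction given in the paper), the following NumPy snippet shows how a one-hidden-layer residual block y = x + V ReLU(Wx + b) + c can be rewritten as a purely series-connected ReLU layer of larger width, using the identity x = ReLU(x) - ReLU(-x); the function and variable names here are illustrative only.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W, b, V, c):
    """Skip-connection block: y = x + V relu(W x + b) + c."""
    return x + V @ relu(W @ x + b) + c

def series_relu_block(x, W, b, V, c):
    """The same map as a single series-connected ReLU layer of larger width.

    The identity (skip) path is carried through the hidden ReLU layer by
    splitting x into its positive and negative parts, since x = relu(x) - relu(-x).
    """
    d = x.shape[0]
    # Hidden layer: stack the original hidden features with +x and -x.
    W_tilde = np.vstack([W, np.eye(d), -np.eye(d)])
    b_tilde = np.concatenate([b, np.zeros(2 * d)])
    h = relu(W_tilde @ x + b_tilde)
    # Output layer: V recombines the features; [I, -I] reconstructs x exactly.
    V_tilde = np.hstack([V, np.eye(d), -np.eye(d)])
    return V_tilde @ h + c

# Check that both parameterizations realize the same function on random data.
rng = np.random.default_rng(0)
d, m = 3, 5
W, b = rng.standard_normal((m, d)), rng.standard_normal(m)
V, c = rng.standard_normal((d, m)), rng.standard_normal(d)
x = rng.standard_normal(d)
assert np.allclose(residual_block(x, W, b, V, c), series_relu_block(x, W, b, V, c))
```

The check at the end confirms that both parameterizations realize the same function; the series-connected model only needs its hidden width enlarged by twice the input dimension to absorb the skip path.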

Acknowledgements

This work was supported by JSPS KAKENHI JP21J12812. I would like to thank Professor Tetsuya Ishiwata, my doctoral supervisor, and everyone who generously gave their time to discuss the various issues with me.

Author information

Corresponding author

Correspondence to Jumpei Nagase.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nagase, J. Mathematical analysis of finite parameter deep neural network models with skip connections from the viewpoint of representation sets. Japan J. Indust. Appl. Math. 39, 1075–1093 (2022). https://doi.org/10.1007/s13160-022-00541-y

