Abstract
Without doubt, the brain is the most complex adaptive system known to humanity; it is arguably also the complex system about which we know the least. In both respects, the brain faces increasing competition from machine learning architectures.
We present an introduction to basic neural network and machine learning concepts, with a special focus on the connection to dynamical systems theory. Starting with point neurons and the XOR problem, the relation between the dynamics of recurrent networks and random matrix theory will be developed. The somewhat counter-intuitive notion of a continuous number of network layers is then shown to lead to neural differential equations, for information processing and for error backpropagation, respectively. Approaches aimed at understanding learning processes in deep architectures often make use of the infinite-layer limit, in which machine learning can be described by Gaussian processes together with neural tangent kernels. Finally, the distinction between information processing and information routing will be discussed, with the latter being the task of the attention mechanism, the core component of transformer architectures.
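As a minimal illustration of the XOR problem mentioned above, the following Python sketch (not taken from the chapter; the weights are hand-set purely for illustration) shows that a network with a single hidden layer of threshold point neurons realizes XOR, something no single linear threshold unit can do.

```python
def step(x):
    # Heaviside step activation of a point neuron
    return 1 if x > 0 else 0

def xor_net(x1, x2):
    """Two-layer threshold network computing XOR with hand-set weights:
    one hidden unit detects OR, the other AND; the output fires for
    'OR but not AND', i.e. exactly one active input."""
    h_or = step(x1 + x2 - 0.5)       # fires if at least one input is on
    h_and = step(x1 + x2 - 1.5)      # fires only if both inputs are on
    return step(h_or - h_and - 0.5)  # OR and not AND  ->  XOR

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_net(x1, x2))   # prints 0, 1, 1, 0
```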
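The connection between recurrent networks and random matrix theory can be previewed numerically: for an N-by-N coupling matrix with independent Gaussian entries of variance g^2/N, the circular law predicts a spectral radius close to g, the quantity that controls whether autonomous recurrent dynamics decay or become chaotic. A short NumPy sketch, with N and g chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
N, g = 500, 1.2                                     # network size, synaptic gain
W = rng.normal(scale=g / np.sqrt(N), size=(N, N))   # random recurrent couplings

radius = np.abs(np.linalg.eigvals(W)).max()
print(f"spectral radius ~ {radius:.2f} (circular law predicts ~ {g})")
```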
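The continuous-layer limit alluded to above can be sketched as a neural ordinary differential equation in the sense of Chen et al. (2018): the residual update of a discrete layer becomes an Euler step of dx/dt = f(x). The toy forward pass below is a sketch under that assumption, using an arbitrary tanh vector field rather than the chapter's own formulation.

```python
import numpy as np

def neural_ode_forward(x, W, b, T=1.0, steps=100):
    """Continuous-depth forward pass dx/dt = tanh(W x + b),
    integrated with explicit Euler; each Euler step plays the
    role of one residual layer of a deep network."""
    dt = T / steps
    for _ in range(steps):
        x = x + dt * np.tanh(W @ x + b)
    return x

rng = np.random.default_rng(0)
d = 8                                          # feature dimension (illustrative)
x0 = rng.normal(size=d)
W, b = rng.normal(size=(d, d)) / np.sqrt(d), np.zeros(d)
print(neural_ode_forward(x0, W, b))
```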
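Finally, information routing by the attention mechanism can be illustrated with a minimal NumPy sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, the building block of transformer architectures (Vaswani et al., 2017); the array shapes and toy data are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    The softmax rows act as routing weights, deciding which value
    vectors are passed on for further processing."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # query-key similarities
    return softmax(scores, axis=-1) @ V   # weighted mixture of values

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, d_k = 8
print(attention(Q, K, V).shape)           # (4, 8)
```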
Notes
- 1.
- 2. For the general theory of stochastic dynamical systems see Chap. 3.
- 3. See Exercise (10.1).
- 4. M. Minsky, S. Papert, “Perceptrons: An Introduction to Computational Geometry” (1969).
- 5. Generic network theory is developed in Chap. 1.
- 6. See Chap. 2.
- 7. More about random variables in Chap. 5.
- 8.
- 9.
- 10.
- 11. See Exercise (10.7).
- 12.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Gros, C. (2024). Complexity of Machine Learning. In: Complex and Adaptive Dynamical Systems. Springer, Cham. https://doi.org/10.1007/978-3-031-55076-8_10
DOI: https://doi.org/10.1007/978-3-031-55076-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-55075-1
Online ISBN: 978-3-031-55076-8