Abstract
Recent work has established clear links between the generalization performance of trained neural networks and the geometry of their loss landscape near the local minima to which they converge. This suggests that qualitative and quantitative examination of the loss landscape geometry could yield insights about neural network generalization performance during training. To this end, researchers have proposed visualizing the loss landscape through the use of simple dimensionality reduction techniques. However, such visualization methods have been limited by their linear nature, capturing features along only one or two directions and thus restricting sampling of the loss landscape to lines or planes. Here, we expand and improve upon these methods in three ways. First, we present a novel “jump and retrain” procedure for sampling relevant portions of the loss landscape. We show that the resulting sampled data holds more meaningful information about the network’s ability to generalize. Next, we show that non-linear dimensionality reduction of the jump-and-retrain trajectories via PHATE, a trajectory- and manifold-preserving method, allows us to visualize differences between networks that generalize well versus poorly. Finally, we combine PHATE trajectories with a computational homology characterization to quantify trajectory differences.
S. Horoi and J. Huang—Equal contribution.
G. Wolf and S. Krishnaswamy—Equal senior-author contribution.
This work was partially funded by NSERC CGSM & FRQNT B1X scholarships [S.H.]; NSERC Discovery Grant RGPIN-2018-04821, Samsung Research Support [G.L.]; and Canada CIFAR AI Chairs [G.L., G.W.]. The content is solely the responsibility of the authors and does not necessarily represent the views of the funding agencies.
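The sketch below illustrates the pipeline the abstract describes: perturb a trained network’s weights (“jump”), record the parameter vectors visited while retraining, and embed the collected trajectories with PHATE. This is a minimal illustration under our own assumptions, not the authors’ implementation; the hyperparameters (`n_jumps`, `jump_scale`, `retrain_steps`) and helper names are hypothetical placeholders.

```python
# Hedged sketch of "jump and retrain" sampling followed by a PHATE embedding.
# Hyperparameters and helper names are illustrative assumptions, not the
# authors' actual settings.
import copy
import torch
import phate  # pip install phate


def flatten_params(model):
    """Concatenate all model parameters into a single 1-D tensor."""
    return torch.cat([p.detach().flatten() for p in model.parameters()])


def jump_and_retrain(model, loss_fn, loader, n_jumps=10, jump_scale=0.5,
                     retrain_steps=100, lr=0.01):
    """From a trained minimum, perturb the weights ("jump"), then record the
    parameter vectors visited while retraining back toward the landscape."""
    trajectories = []
    for _ in range(n_jumps):
        net = copy.deepcopy(model)
        # Jump: add isotropic Gaussian noise to every parameter.
        with torch.no_grad():
            for p in net.parameters():
                p.add_(jump_scale * torch.randn_like(p))
        opt = torch.optim.SGD(net.parameters(), lr=lr)
        points = [flatten_params(net)]
        data_iter = iter(loader)
        for _ in range(retrain_steps):
            try:
                x, y = next(data_iter)
            except StopIteration:
                data_iter = iter(loader)
                x, y = next(data_iter)
            opt.zero_grad()
            loss_fn(net(x), y).backward()
            opt.step()
            points.append(flatten_params(net))  # snapshot after each step
        trajectories.append(torch.stack(points))
    return torch.stack(trajectories)  # (n_jumps, retrain_steps + 1, n_params)


# Embed all sampled parameter vectors with PHATE (Moon et al., 2019), which
# preserves trajectory and manifold structure in two dimensions, e.g.:
#   traj = jump_and_retrain(trained_model, loss_fn, train_loader)
#   embedding = phate.PHATE(n_components=2).fit_transform(
#       traj.reshape(-1, traj.shape[-1]).numpy())
# A persistent homology summary of the embedded trajectories (e.g. via the
# ripser package: ripser(embedding)['dgms']) can then quantify differences
# between trajectories, in the spirit of the abstract's final step.
```

The 2-D PHATE coordinates can be plotted per trajectory to compare networks that generalize well against those that do not; the persistence diagrams give a quantitative counterpart to that visual comparison.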
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Horoi, S., Huang, J., Rieck, B., Lajoie, G., Wolf, G., Krishnaswamy, S. (2022). Exploring the Geometry and Topology of Neural Network Loss Landscapes. In: Bouadi, T., Fromont, E., Hüllermeier, E. (eds) Advances in Intelligent Data Analysis XX. IDA 2022. Lecture Notes in Computer Science, vol 13205. Springer, Cham. https://doi.org/10.1007/978-3-031-01333-1_14
DOI: https://doi.org/10.1007/978-3-031-01333-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-01332-4
Online ISBN: 978-3-031-01333-1
eBook Packages: Computer Science, Computer Science (R0)