Abstract
We use the framework of topological data analysis to examine the geometry of the loss landscape. Drawing on topology and Morse theory, we propose analysing 1-dimensional topological invariants as a measure of loss function non-convexity up to arbitrary re-parametrization. The proposed approach optimizes 2-dimensional simplices in the space of network weights and enables both qualitative and quantitative evaluation of the loss landscape, giving insight into the behavior and optimization of neural networks. We provide a geometrical interpretation of the topological invariants and describe an algorithm for their computation. We expect that the proposed approach can complement existing tools for loss landscape analysis and shed light on unresolved issues in deep learning.
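To make the idea of a 2-dimensional simplex in weight space concrete, the following minimal sketch samples a loss over a triangle spanned by three weight vectors and compares its maximum with the maximum along the straight segment between two minima. It is an illustration only and not the algorithm described in the paper: the simplex here is fixed and merely evaluated, not optimized, and `toy_loss`, `simplex_grid`, `max_loss_on_simplex`, and the chosen vertices are all hypothetical names and values introduced for this example.

```python
import numpy as np

def toy_loss(w):
    # Hypothetical non-convex loss: a double well in the first coordinate
    # (minima at w[0] = -1 and w[0] = +1) plus a quadratic in the remaining ones.
    return (w[0] ** 2 - 1.0) ** 2 + 0.5 * np.sum(w[1:] ** 2)

def simplex_grid(n):
    # Barycentric coordinates (a, b, c) with a + b + c = 1 on a regular grid.
    return np.array([(i / n, j / n, (n - i - j) / n)
                     for i in range(n + 1)
                     for j in range(n + 1 - i)])

def max_loss_on_simplex(vertices, loss, n=20):
    # vertices has shape (3, d); each grid point is a convex combination of its rows.
    points = simplex_grid(n) @ vertices
    return max(loss(p) for p in points)

# Two minima of the toy loss and an assumed third vertex spanning the triangle.
v0 = np.array([-1.0, 0.0])
v1 = np.array([1.0, 0.0])
apex = np.array([0.0, 1.0])
vertices = np.stack([v0, v1, apex])

# Highest loss along the straight segment between the two minima.
edge_max = max(toy_loss((1 - t) * v0 + t * v1) for t in np.linspace(0.0, 1.0, 41))

# Highest loss over the whole sampled triangle.
tri_max = max_loss_on_simplex(vertices, toy_loss)

print(edge_max, tri_max)
```

In this toy setting the straight segment between the two minima crosses a barrier, which is the kind of non-convexity a segment-based measurement detects. A simplex-based procedure would additionally move the vertices to reduce the loss over the triangle, a step this sketch leaves out.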
Funding
This work was partially supported by the Next Generation Program (3rd Call for Proposals).
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Additional information
Publisher’s Note.
Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Voronkova, D.S., Barannikov, S.A. & Burnaev, E.V. 1-Dimensional Topological Invariants to Estimate Loss Surface Non-Convexity. Dokl. Math. 108 (Suppl 2), S325–S332 (2023). https://doi.org/10.1134/S1064562423701569
DOI: https://doi.org/10.1134/S1064562423701569