
Recent advances in algebraic geometry and Bayesian statistics

Abstract

This article reviews the theoretical advances of the last two decades in the research field of algebraic geometry and Bayesian statistics. Many statistical models and learning machines that contain hierarchical structures or latent variables are called nonidentifiable, because the map from a parameter to a statistical model is not one-to-one. In nonidentifiable models, both the likelihood function and the posterior distribution generally have singularities, which made it difficult to analyze their statistical properties. Since the end of the 20th century, however, a new theory and methodology based on algebraic geometry have been established that enable us to investigate such models and machines as they are used in the real world. In this article, the following recent advances are reported. First, we explain the framework of Bayesian statistics and introduce a new perspective from birational geometry. Second, two mathematical solutions derived from algebraic geometry are presented: an appropriate parameter space can be found by a resolution map, which makes the posterior distribution normal crossing and the log likelihood ratio function well-defined. Third, three applications to statistics are introduced: the posterior distribution is represented in a renormalized form, the asymptotic free energy is derived, and a universal formula relating the generalization loss, the cross validation loss, and the information criterion is established. The two mathematical solutions and three statistical applications based on algebraic geometry reported in this article are now used in many practical fields of data science and artificial intelligence.
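
For orientation, the asymptotic free energy and the universal formula mentioned above are usually stated, under suitable conditions, in the standard notation of singular learning theory (F_n: free energy, S: entropy of the true distribution, \lambda: real log canonical threshold, G_n: generalization loss, C_n: leave-one-out cross validation loss, W_n: WAIC). The following is a sketch in that common notation, not a quotation of the article's own statements, whose precise assumptions and error terms are given in the full text.

\mathbb{E}[F_n] = nS + \lambda \log n + o(\log n), \qquad \mathbb{E}[G_n] = \mathbb{E}[C_n] + o(1/n) = \mathbb{E}[W_n] + o(1/n).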


Data availability statement

Data sharing is not applicable to this article as no data sets were generated or analyzed during the current study.


Author information

Corresponding author

Correspondence to Sumio Watanabe.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Additional information

Communicated by Noboru Murata.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Watanabe, S. Recent advances in algebraic geometry and Bayesian statistics. Info. Geo. (2022). https://doi.org/10.1007/s41884-022-00083-9



Keywords

  • Birational geometry
  • Resolution of singularities
  • Bayesian statistics
  • Real log canonical threshold