
Maximum Likelihood Estimates for Gaussian Mixtures Are Transcendental

  • Carlos Améndola
  • Mathias Drton
  • Bernd Sturmfels
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9582)

Abstract

Gaussian mixture models are central to classical statistics, widely used in the information sciences, and have a rich mathematical structure. We examine their maximum likelihood estimates through the lens of algebraic statistics. The MLE is not an algebraic function of the data, so there is no notion of ML degree for these models. The critical points of the likelihood function are transcendental, and there is no bound on their number, even for mixtures of two univariate Gaussians.
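For orientation, the objects named in the abstract can be written out as follows. This is a sketch in one standard parameterization of a mixture of two univariate Gaussians; the symbols (mixing weight \alpha, means \mu_1, \mu_2, standard deviations \sigma_1, \sigma_2, data x_1, ..., x_n) are assumptions chosen for illustration and may not match the paper's notation or the exact submodel it analyzes.

```latex
% Mixture density with mixing weight \alpha \in [0,1] (assumed notation)
f(x) \;=\; \alpha\,\varphi_1(x) + (1-\alpha)\,\varphi_2(x),
\qquad
\varphi_j(x) \;=\; \frac{1}{\sigma_j\sqrt{2\pi}}
  \exp\!\left(-\frac{(x-\mu_j)^2}{2\sigma_j^2}\right).

% Log-likelihood of a sample x_1,\dots,x_n
\ell(\alpha,\mu_1,\mu_2,\sigma_1,\sigma_2) \;=\; \sum_{i=1}^{n} \log f(x_i).

% One of the critical equations, obtained by differentiating in \mu_1
\frac{\partial \ell}{\partial \mu_1}
\;=\; \sum_{i=1}^{n}
  \frac{\alpha\,\varphi_1(x_i)}{f(x_i)}\cdot\frac{x_i-\mu_1}{\sigma_1^{2}}
\;=\; 0.
```

The exponentials in \varphi_1 and \varphi_2 do not cancel in the critical equations, so these are not polynomial equations in the parameters. This is the setting in which the abstract says the critical points are transcendental.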

Keywords

Algebraic statistics; Expectation maximization; Maximum likelihood; Mixture model; Normal distribution; Transcendence theory

Notes

Acknowledgements

CA and BS were supported by the Einstein Foundation Berlin. MD and BS also thank the US National Science Foundation (DMS-1305154 and DMS-1419018).

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Carlos Améndola (1)
  • Mathias Drton (2)
  • Bernd Sturmfels (1, 3)

  1. Technische Universität, Berlin, Germany
  2. University of Washington, Seattle, USA
  3. University of California, Berkeley, USA
