Part of the book series: Springer Optimization and Its Applications (SOIA, volume 103)

Abstract

In this chapter, we study parameter estimation from the viewpoint of optimization.


Notes

  1. In the frequentist approach, the parameter \(\boldsymbol{\theta }\) is assumed to be a constant, and the MLE method estimates \(\boldsymbol{\theta }\) with a certain confidence. In the Bayesian approach, by contrast, \(\boldsymbol{\theta }\) is treated as a random variable with an a priori distribution. To distinguish the two approaches, we use the form \(f(\boldsymbol{x};\boldsymbol{\theta } )\) for the frequentist approach (including the MLE method) and the form \(f(\boldsymbol{x}\ \vert \ \boldsymbol{\theta })\) for the Bayesian approach [1] throughout this book. If we assume \(\boldsymbol{\theta }\) follows a uniform distribution on \(\boldsymbol{\varTheta }\) (i.e., \(f(\boldsymbol{\theta })\) is constant on \(\boldsymbol{\varTheta }\), so we have no special a priori information about \(\boldsymbol{\theta }\)), the likelihood function under the Bayesian view is proportional to the likelihood function under the frequentist view, since \(\prod _{i=1}^{n}f(\boldsymbol{x}_{i}\ \vert \ \boldsymbol{\theta })\,f(\boldsymbol{\theta }) \propto \prod _{i=1}^{n}f(\boldsymbol{x}_{i};\boldsymbol{\theta } )\). Hence the MLE and Maximum A Posteriori (MAP) estimators coincide under these assumptions; a small numerical sketch of this coincidence follows the note.

     It should be pointed out that some of the literature uses the form \(f(\boldsymbol{x}\ \vert \ \boldsymbol{\theta })\) for the MLE method as well.
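
    To make the coincidence concrete, here is a minimal numerical sketch (not from the book; the Gaussian model, the grid, and all variable names are illustrative assumptions): under a flat prior, the log-posterior differs from the log-likelihood only by an additive constant, so both are maximized at the same point of the grid.

        import numpy as np

        # Minimal sketch: MLE vs. MAP for the mean theta of a N(theta, 1) sample.
        # With a uniform (flat) prior, log-posterior = log-likelihood + constant,
        # so the argmax over theta is identical for both estimators.
        rng = np.random.default_rng(0)
        x = rng.normal(loc=2.0, scale=1.0, size=100)   # observed sample

        theta_grid = np.linspace(0.0, 4.0, 4001)       # candidate parameter values
        # Gaussian log-likelihood sum_i log f(x_i; theta), constants dropped
        log_lik = np.array([-0.5 * np.sum((x - t) ** 2) for t in theta_grid])
        log_prior = 0.0                                # flat prior: constant in theta
        log_post = log_lik + log_prior                 # posterior up to normalization

        theta_mle = theta_grid[np.argmax(log_lik)]
        theta_map = theta_grid[np.argmax(log_post)]
        assert theta_mle == theta_map                  # the two estimators coincide
        print(theta_mle, theta_map)                    # both approximately x.mean()

    With a non-constant log-prior added to log_lik (say, a Gaussian prior centered away from the sample mean), the two argmaxes would in general differ, which is exactly the frequentist/Bayesian distinction the note draws.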

References

  1. Lee, P.M.: Bayesian Statistics: An Introduction, 4th edn. Wiley, Chichester (2012)

  2. Lehmann, E.L., Casella, G.: Theory of Point Estimation, 2nd edn. Springer, New York (1998)

  3. Casella, G., Berger, R.L.: Statistical Inference, 2nd edn. Duxbury and Thomson Learning, Pacific Grove (2002)

  4. Efron, B.: Maximum likelihood and decision theory. Ann. Stat. 10(2), 340–356 (1982)

  5. Saumard, A., Wellner, J.A.: Log-concavity and strong log-concavity: a review. Stat. Surv. 8, 45–114 (2014)

  6. Wu, C.: On the convergence properties of the EM algorithm. Ann. Stat. 11(1), 95–103 (1983)

  7. Ma, J., Xu, L., Jordan, M.: Asymptotic convergence rate of the EM algorithm for Gaussian mixtures. Neural Comput. 12(12), 2881–2907 (2000)

  8. Neal, R.M., Hinton, G.E.: A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Jordan, M.I. (ed.) Learning in Graphical Models, pp. 355–368. MIT Press, Cambridge (1999)

  9. McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn. Wiley, Hoboken (2008)

  10. Gupta, M.R., Chen, Y.: Theory and use of the EM algorithm. Found. Trends Signal Process. 4(3), 223–296 (2010)

  11. Hartley, H.: Maximum likelihood estimation from incomplete data. Biometrics 14(2), 174–194 (1958)

  12. Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Methodol. 39(1), 1–38 (1977)

  13. Roweis, S.: EM algorithms for PCA and SPCA. In: Advances in Neural Information Processing Systems, vol. 10, pp. 626–632. Denver (1998)

  14. Tipping, M.E., Bishop, C.M.: Probabilistic principal component analysis. J. R. Stat. Soc. Ser. B 61(3), 611–622 (1999)

  15. Jain, A.K.: Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 31(8), 651–666 (2010)

  16. Arthur, D., Vassilvitskii, S.: How slow is the k-means method? In: Proceedings of the 22nd Annual Symposium on Computational Geometry, Sedona, pp. 144–153. ACM (2006)

  17. Bhat, B.R.: Maximum likelihood estimation for positively regular Markov chains. Sankhyā: Indian J. Stat. 22(3–4), 339–344 (1960)

Copyright information

© 2015 Tsinghua University Press, Beijing and Springer-Verlag Berlin Heidelberg

Cite this chapter

Li, L. (2015). Parameter Estimations. In: Selected Applications of Convex Optimization. Springer Optimization and Its Applications, vol 103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46356-7_3
