The EM Algorithm

Part of the book series: Springer Handbooks of Computational Statistics ((SHCS))

Abstract

The Expectation-Maximization (EM) algorithm is a broadly applicable approach to the iterative computation of maximum likelihood estimates in a wide variety of incomplete-data problems. The EM algorithm has a number of desirable properties, such as numerical stability, reliable global convergence, and simplicity of implementation. There are, however, two main drawbacks of the basic EM algorithm: the lack of a built-in procedure for computing the covariance matrix of the parameter estimates, and slow convergence. In addition, some complex problems lead to intractable Expectation-steps and Maximization-steps. The first edition of this chapter, published in 2004, covered the basic theoretical framework of the EM algorithm and discussed extensions of the algorithm to handle complex problems. This second edition captures advanced developments in EM methodology in recent years, especially in its applications to the related fields of biomedical and health sciences.
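To make the E-step/M-step alternation concrete, here is a minimal sketch (not taken from the chapter; all function and variable names are illustrative) of EM for a two-component univariate normal mixture, a canonical incomplete-data problem in which the unobserved component labels play the role of the missing data:

```python
import numpy as np

def em_gmm_1d(x, n_iter=200):
    """EM for a two-component univariate normal mixture.

    The component labels are the missing data; the E-step computes their
    posterior probabilities, and the M-step maximizes the resulting
    expected complete-data log likelihood in closed form.
    """
    # initialise from robust quantiles of the observed data
    mu = np.quantile(x, [0.25, 0.75])
    var = np.array([x.var(), x.var()])
    pi = 0.5  # mixing proportion of component 1
    for _ in range(n_iter):
        # E-step: posterior probability (responsibility) that each
        # observation arose from component 1, given current parameters
        d0 = np.exp(-0.5 * (x - mu[0]) ** 2 / var[0]) / np.sqrt(2 * np.pi * var[0])
        d1 = np.exp(-0.5 * (x - mu[1]) ** 2 / var[1]) / np.sqrt(2 * np.pi * var[1])
        r = pi * d1 / ((1 - pi) * d0 + pi * d1)
        # M-step: weighted maximum likelihood updates
        pi = r.mean()
        mu = np.array([np.sum((1 - r) * x) / np.sum(1 - r),
                       np.sum(r * x) / np.sum(r)])
        var = np.array([np.sum((1 - r) * (x - mu[0]) ** 2) / np.sum(1 - r),
                        np.sum(r * (x - mu[1]) ** 2) / np.sum(r)])
    return pi, mu, var

# simulated data: two well-separated normal components
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])
prop, means, variances = em_gmm_1d(x)
```

Each iteration can be shown never to decrease the observed-data log likelihood, which is the monotonicity property behind the numerical stability noted above; the slow (linear) convergence mentioned as a drawback is also visible here when the components overlap heavily.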



Corresponding author

Correspondence to Shu Kay Ng.


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

Cite this chapter

Ng, S.K., Krishnan, T., McLachlan, G.J. (2012). The EM Algorithm. In: Gentle, J., Härdle, W., Mori, Y. (eds) Handbook of Computational Statistics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21551-3_6
