Computational Statistics, Volume 31, Issue 4, pp 1513–1538

Stochastic EM algorithms for parametric and semiparametric mixture models for right-censored lifetime data

  • Laurent Bordes
  • Didier Chauveau
Original Paper


Mixture models in reliability offer a useful compromise between parametric and nonparametric models when several failure modes are suspected. The classical methods for estimation in mixture models rarely handle the additional difficulty that lifetime data are often censored, in a deterministic or random way. In this paper we present several iterative methods, based on EM and Stochastic EM methodologies, that allow us to estimate parametric or semiparametric mixture models for randomly right-censored lifetime data, provided they are identifiable. We consider different levels of completion for the (incomplete) observed data, and provide genuine or EM-like algorithms for several situations. In particular, we show that simulating the missing data coming from the mixture makes it possible to plug a standard R package for survival data analysis into the M-step of an EM algorithm. Moreover, in censored semiparametric situations, a stochastic step is the only practical solution allowing computation of nonparametric estimates of the unknown survival function. The effectiveness of the newly proposed algorithms is demonstrated in simulation studies and on an actual dataset from the aeronautic industry.
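The idea of completing the data by simulation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes exponential components, for which the censored maximum-likelihood estimate has a closed form, where the paper refers to a standard R survival package. The function names `simulate_censored_mixture` and `stochastic_em`, the burn-in length and the averaging of the chain are all illustrative assumptions.

```python
import math
import random


def simulate_censored_mixture(n, weights, rates, censor_rate, rng):
    """Draw n randomly right-censored lifetimes from an exponential mixture.

    Returns a list of (observed_time, event_indicator) pairs, where the
    indicator is 1 for an observed failure and 0 for a censored time.
    """
    data = []
    for _ in range(n):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        lifetime = rng.expovariate(rates[j])       # latent failure time
        censoring = rng.expovariate(censor_rate)   # independent censoring time
        data.append((min(lifetime, censoring), 1 if lifetime <= censoring else 0))
    return data


def stochastic_em(data, weights, rates, n_iter=200, burn_in=50, seed=0):
    """Stochastic EM for a mixture of exponentials under right censoring.

    Assumption (hypothetical setup): exponential components, so the M-step
    is the closed-form censored MLE, events / total exposure.
    """
    rng = random.Random(seed)
    weights, rates = list(weights), list(rates)
    k, n = len(weights), len(data)
    sum_weights, sum_rates = [0.0] * k, [0.0] * k

    for it in range(n_iter):
        counts, events, exposure = [0] * k, [0] * k, [0.0] * k
        for t, d in data:
            # E-step: posterior component weights use the density for an
            # observed failure and the survival function for a censored time.
            post = []
            for j in range(k):
                surv = math.exp(-rates[j] * t)
                post.append(weights[j] * (rates[j] * surv if d else surv))
            # S-step: simulate the missing component label from the posterior.
            u = rng.random() * sum(post)
            z, acc = 0, post[0]
            while u > acc and z < k - 1:
                z += 1
                acc += post[z]
            counts[z] += 1
            events[z] += d
            exposure[z] += t
        # M-step: with labels simulated, each component is an ordinary
        # right-censored sample and is fitted separately.
        for j in range(k):
            weights[j] = counts[j] / n
            if events[j] > 0 and exposure[j] > 0:
                rates[j] = events[j] / exposure[j]
        # The iterates form a Markov chain; average them after burn-in.
        if it >= burn_in:
            for j in range(k):
                sum_weights[j] += weights[j]
                sum_rates[j] += rates[j]

    m = n_iter - burn_in
    return [s / m for s in sum_weights], [s / m for s in sum_rates]
```

A call such as `stochastic_em(simulate_censored_mixture(1000, (0.4, 0.6), (0.2, 2.0), 0.1, random.Random(1)), (0.5, 0.5), (0.5, 1.0))` returns the averaged weights and rates. Because the S-step assigns every observation to a single component, the M-step only ever sees ordinary censored samples, which is what allows an off-the-shelf survival fitting routine to be substituted for the closed-form update used here.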


Keywords: Censored data; Stochastic EM algorithm; Finite mixture; Reliability; Semiparametric mixtures; Survival data



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. LMAP – UMR CNRS 5142, Univ. Pau & Pays de l’Adour, Pau, France
  2. MAPMO – UMR CNRS 7349, Univ. d’Orléans, Orléans, France
