Robust estimation for the order of finite mixture models

Metrika

Abstract

In this paper, we study a robust and efficient procedure for estimating the order of finite mixture models, based on minimizing a penalized density power divergence. For this task, we use the locally conic parametrization approach developed by Dacunha-Castelle and Gassiat (ESAIM Probab Stat 285–317, 1997a; Ann Stat 27:1178–1209, 1999), and verify that the resulting minimum penalized density power divergence estimator is consistent. Simulation results are provided for illustration.
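
For orientation, the density power divergence of Basu et al. (1998) between a true density g and a model density f, with tuning parameter α > 0, can be written as below, together with its empirical counterpart for a sample X_1, ..., X_n. The last line is only a generic sketch of a penalized order criterion: the class G_p of mixtures with at most p components, the upper bound P, and the penalty a_n(p) are notation assumed here for illustration, and the exact penalty used in the paper is not stated in this abstract.

```latex
% Density power divergence (Basu et al. 1998), \alpha > 0
d_\alpha(g, f) = \int \left\{ f^{1+\alpha}(z)
    - \left(1 + \frac{1}{\alpha}\right) g(z)\, f^{\alpha}(z)
    + \frac{1}{\alpha}\, g^{1+\alpha}(z) \right\} dz

% Empirical objective: the term involving only g is dropped,
% since it does not depend on the model f
H_{n,\alpha}(f) = \int f^{1+\alpha}(z)\, dz
    - \left(1 + \frac{1}{\alpha}\right) \frac{1}{n} \sum_{i=1}^{n} f^{\alpha}(X_i)

% Generic penalized order estimate (notation assumed for illustration)
\hat{p} = \arg\min_{1 \le p \le P}
    \left\{ \min_{f \in G_p} H_{n,\alpha}(f) + a_n(p) \right\}
```

As α tends to 0 the divergence approaches the Kullback-Leibler divergence, and α = 1 gives the L2 distance, so α trades efficiency against robustness.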

References

  • Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F (eds) Second international symposium on information theory, pp 267–281

  • Basu A, Harris IR, Hjort NL and Jones MC (1998). Robust and efficient estimation by minimizing a density power divergence. Biometrika 85: 549–559

  • Beran R (1977). Minimum Hellinger distance estimates for parametric models. Ann Stat 5: 445–463

  • Chen J and Kalbfleisch JD (1996). Penalized minimum distance estimates in finite mixture models. Can J Stat 24: 167–175

  • Cressie N and Read TRC (1984). Multinomial goodness-of-fit tests. J R Stat Soc B 46: 440–464

  • Csiszár I (1963) Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, vol 8, pp 84–108

  • Dacunha-Castelle D, Gassiat E (1997a) Testing in locally conic models and application to mixture models. ESAIM Probab Stat 285–317

  • Dacunha-Castelle D and Gassiat E (1997b). The estimation of the order of a mixture model. Bernoulli 3: 279–299

  • Dacunha-Castelle D and Gassiat E (1999). Testing the order of a model using locally conic parametrization: population mixtures and stationary ARMA processes. Ann Stat 27: 1178–1209

  • Dempster AP, Laird NM and Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39: 1–38

  • Dudley RM and Philipp W (1983). Invariance principles for sums of Banach space valued random elements and empirical processes. Z Wahrsch Verw Geb 62: 509–552

  • Fujisawa H and Eguchi S (2007). Robust estimation in the normal mixture model. J Stat Plan Inference 136: 3989–4011

  • Furman WD and Lindsay BG (1994). Testing for the number of components in a mixture of normal distributions using moment estimators. Comput Stat Data Anal 17: 473–492

  • Gassiat E (2002). Likelihood ratio inequalities with applications to various mixtures. Ann de l’Inst Henri Poincaré 38: 897–906

  • Heckman JJ, Robb R and Walker JR (1990). Testing the mixture of exponentials hypothesis and estimating the mixing distribution by the method of moments. J Am Stat Assoc 85: 582–589

  • Henna J (1985). On estimating the number of constituents of a finite mixture of continuous distributions. Ann Inst Stat Math 37: 235–240

  • Hjort NL (1994) Minimum L2 and robust Kullback-Leibler estimation. In: Lachout P, Víšek JÁ (eds) Proceedings of the 12th Prague conference on information theory, statistical decision functions and random processes. Prague Academy of Sciences of the Czech Republic, pp 102–105

  • Hong C and Kim Y (2001). Automatic selection of the tuning parameter in the minimum density power divergence estimation. J Korean Stat Soc 30: 453–465

  • Izenman AJ, Sommer C (1988) Philatelic mixtures and multimodal densities. J Am Stat Assoc 83: 941–953

  • James LF, Priebe CE and Marchette DJ (2001). Consistent estimation of mixture complexity. Ann Stat 29: 1281–1296

  • Keribin C (2000). Consistent estimation of the order of mixture models. Sankhyā 62: 49–66

  • Leroux BG (1992). Consistent estimation of a mixing distribution. Ann Stat 20: 1350–1360

  • Lindsay BG (1989). Moment matrices: applications in mixtures. Ann Stat 17: 722–740

  • Liu X and Shao Y (2003). Asymptotics for likelihood ratio tests under loss of identifiability. Ann Stat 31: 807–832

  • Mattheou K, Lee S, Karagrigoriou A (2008) The BHHJ measure: properties, characterizations, goodness of fit tests and model selection. J Stat Plan Inference (accepted)

  • Pardo L (2006). Statistical inference based on divergence measures. Chapman and Hall/CRC, London/Boca Raton

  • Priebe CE and Marchette DJ (2000). Alternating kernel and mixture density estimates. Comput Stat Data Anal 35: 43–65

  • Redner R and Walker HF (1984). Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev 26: 195–239

  • Roeder K (1994). A graphical technique for determining the number of components in a mixture of normals. J Am Stat Assoc 89: 487–495

  • Schwarz G (1978). Estimating the dimension of a model. Ann Stat 6: 461–464

  • Scott DW (1998). Parametric modelling by minimum L2 error. TR 98-3. Rice University, Houston

  • Scott DW (2001). Parametric statistical modelling by minimum integrated square error. Technometrics 43: 274–285

  • Tamura R and Boos D (1986). Minimum Hellinger distance estimation for multivariate location and covariance. J Am Stat Assoc 81: 223–229

  • Teicher H (1965). Identifiability of finite mixtures. Ann Math Stat 36: 423–439

  • Terrell GR (1990) Linear density estimates. In: Proceedings of the statistical computing section. Am Stat Assoc, pp 297–302

  • Van der Vaart AW and Wellner JA (1996). Weak convergence and empirical processes. Springer, Berlin

  • Warwick J and Jones MC (2005). Choosing a robustness tuning parameter. J Stat Comput Simul 75: 581–588

  • Woodward WA, Parr WC, Schucany WR and Lindsay H (1984). A comparison of minimum distance and maximum likelihood estimation of a mixture proportion. J Am Stat Assoc 79: 590–598

  • Yakowitz SJ and Spragins JA (1968). On the identifiability of finite mixtures. Ann Math Stat 39: 209–214

Author information

Corresponding author

Correspondence to Taewook Lee.

About this article

Cite this article

Lee, S., Lee, T. Robust estimation for the order of finite mixture models. Metrika 68, 365–390 (2008). https://doi.org/10.1007/s00184-007-0168-x

  • DOI: https://doi.org/10.1007/s00184-007-0168-x
