Computational Statistics, Volume 18, Issue 1, pp 1–17

Fitting a Mixture Distribution to a Variable Subject to Heteroscedastic Measurement Errors

  • Markus Thamerus


In a structural errors-in-variables model the true regressors are treated as stochastic variables that can only be measured with an additional error. The distributions of the latent predictor variables and of the measurement errors therefore play an important role in the analysis of such models. In this article the conventional normality assumptions for these distributions are extended in two directions: the distribution of the true regressor variable is assumed to be a mixture of normal distributions, and the measurement errors are again taken to be normally distributed, but with heteroscedastic error variances. It is shown how an EM algorithm based solely on the error-prone observations of the latent variable can be used to find approximate ML estimates of the distribution parameters of the mixture. The procedure is illustrated with a Swiss data set of regional radon measurements, in which the mean concentrations of the regions serve as proxies for the true regional radon averages. The differing variability of the measurements within the regions motivated this approach.


Heteroscedastic measurement errors · Finite mixture distribution · EM algorithm
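The estimation problem described in the abstract can be sketched as a small EM routine. This is a minimal illustrative sketch, not the author's implementation: it assumes the model W_i = X_i + U_i, where X_i follows a k-component normal mixture and U_i ~ N(0, tau_i^2) with known heteroscedastic error variances, so that marginally W_i given component j is N(mu_j, sigma_j^2 + tau_i^2). All function and variable names are hypothetical.

```python
import numpy as np

def em_mixture_heteroscedastic(w, tau2, k, n_iter=300, seed=0):
    """EM sketch for a k-component normal mixture observed with known
    heteroscedastic normal measurement error (illustrative, hypothetical).

    Model: W_i = X_i + U_i,  X_i ~ sum_j p_j N(mu_j, s2_j),
           U_i ~ N(0, tau2_i) with tau2_i known.
    """
    rng = np.random.default_rng(seed)
    w, tau2 = np.asarray(w, float), np.asarray(tau2, float)
    n = len(w)
    p = np.full(k, 1.0 / k)                      # mixing proportions
    mu = rng.choice(w, size=k, replace=False)    # crude starting means
    s2 = np.full(k, np.var(w))                   # component variances
    for _ in range(n_iter):
        # E-step: responsibilities under W_i | j ~ N(mu_j, s2_j + tau2_i)
        v = s2[None, :] + tau2[:, None]          # n x k marginal variances
        dens = np.exp(-0.5 * (w[:, None] - mu[None, :]) ** 2 / v)
        dens /= np.sqrt(2.0 * np.pi * v)
        r = p[None, :] * dens
        r /= r.sum(axis=1, keepdims=True)
        # Conditional moments of the latent X_i given W_i and component j
        shrink = s2[None, :] / v                 # shrinkage toward mu_j
        ex = mu[None, :] + shrink * (w[:, None] - mu[None, :])
        vx = shrink * tau2[:, None]              # posterior variance of X_i
        # M-step: weighted updates using the imputed latent moments
        nk = r.sum(axis=0)
        p = nk / n
        mu = (r * ex).sum(axis=0) / nk
        s2 = (r * ((ex - mu[None, :]) ** 2 + vx)).sum(axis=0) / nk
    return p, mu, s2
```

The E-step computes both component responsibilities and the first two conditional moments of the latent X_i; because tau2_i varies across observations, each observation carries its own shrinkage factor, which is exactly what distinguishes this from the homoscedastic normal-mixture EM.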



This research was partly supported by the Deutsche Forschungsgemeinschaft (German Research Council). I would like to thank Ch. E. Minder for discussions and for introducing the problem. I would also like to thank an anonymous referee for directing my attention to some very general problems in the estimation of mixture models. Helpful discussions with H. Schneeweiss and R. Wolf are gratefully acknowledged.



Copyright information

© Physica-Verlag 2003

Authors and Affiliations

  • Markus Thamerus
    1. Institute of Statistics, University of Munich, München, Germany
