Quality and Quantity, Volume 39, Issue 1, pp 19–36

Mixture Models of Missing Data

  • Tamás Rudas


This paper proposes a general framework for the analysis of survey data with missing observations. The approach treats missing data as an unavoidable feature of any survey of the human population and aims to incorporate the unobserved part of the data into the analysis rather than trying to avoid it or make up for it. To handle coverage error and unit non-response, the true distribution is modeled as a mixture of an observable and an unobservable component. Generally, neither the relative size of the unobserved component (the no-observation rate) nor its distribution is known. It is assumed that the goal of the analysis is to assess the fit of a statistical model, and for this purpose the mixture index of fit is used. The mixture index of fit does not postulate that the statistical model of interest accounts for the entire population, but rather that it may describe only a fraction of it. This leads to a second mixture representation of the true distribution, with one component from the statistical model of interest and another, unrestricted component. Inference about the fit of the model, with missing data taken into account, is obtained by equating these two mixtures and asking, for different no-observation rates, what is the largest fraction of the population in which the statistical model may hold. A statistical model is deemed relevant for the population if it may account for a large enough fraction of the population, assuming the true no-observation rate (if known), a sufficiently small one, or a realistic one.
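The idea of equating the two mixtures can be illustrated with a small numerical sketch. It is not the paper's algorithm: it assumes, for simplicity, that the statistical model consists of a single fully specified distribution (real models are families of distributions, which would require an additional search over the model). The names `max_model_fraction`, `observed`, `model`, and `rho` are hypothetical and introduced only for this illustration. For a no-observation rate `rho`, the true distribution is written as `(1 - rho) * observed + rho * U` with `U` unknown, and the sketch looks for the largest fraction `s` such that `s * model` fits under this mixture for some distribution `U`.

```python
def max_model_fraction(observed, model, rho, tol=1e-10):
    """Largest s in [0, 1] such that, for some distribution U,
    s * model[i] <= (1 - rho) * observed[i] + rho * U[i] in every cell i.

    Assumption for illustration: the model is one fixed distribution.
    """
    def needed_unobserved_mass(s):
        # Smallest total mass the unobserved part must contribute
        # so that s * model fits under the true distribution.
        return sum(max(0.0, s * m - (1.0 - rho) * o)
                   for o, m in zip(observed, model))

    # The model may describe the whole population.
    if needed_unobserved_mass(1.0) <= rho + 1e-12:
        return 1.0

    # needed_unobserved_mass is nondecreasing in s, so bisect on s.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if needed_unobserved_mass(mid) <= rho:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    observed = [0.5, 0.5]    # hypothetical observed distribution
    model = [0.25, 0.75]     # hypothetical one-point model
    for rho in (0.0, 0.1, 0.2, 0.5):
        print(rho, round(max_model_fraction(observed, model, rho), 4))
```

With these made-up numbers, the fraction that the model may describe is about 2/3 when nothing is missing and grows as the assumed no-observation rate increases, reaching 1 at a rate of 0.5. This reflects the abstract's point that the conclusion about fit depends on which no-observation rate is regarded as true or realistic.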


Keywords: missing data, mixture index of fit, model diagnostics, no-fit rate, no-observation rate





Copyright information

© Springer 2005

Authors and Affiliations

  1. Department of Statistics, Faculty of Social Sciences, Eötvös Loránd University, Budapest, Hungary
