Semiparametric predictive mean matching

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

Predictive mean matching is an imputation method that combines parametric and nonparametric techniques. It imputes missing values with a nearest-neighbor donor, where the distance is computed on the expected values of the missing variables conditional on the observed covariates rather than directly on the covariate values. In ordinary predictive mean matching, these expected values are computed through a linear regression model. This paper studies a generalization in which the expected values used for the distance are estimated through an approach based on Gaussian mixture models. The approach includes the original predictive mean matching as a special case and can also handle nonlinear relationships among the variables. Its performance is assessed through an empirical evaluation based on simulations.
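
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single covariate, a single variable with missing values, and mixture parameters that are already known (estimating them is omitted here). The names `fit_line`, `pmm_impute`, and `mixture_mean` are hypothetical and chosen only for this illustration.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for one covariate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pmm_impute(xs, ys, predict=None):
    """Replace each None in ys with the observed y of the nearest donor.

    Distance is measured between predicted means, not between covariates.
    If no predictor is given, a linear regression fitted on the complete
    cases is used (ordinary predictive mean matching).
    """
    observed = [i for i, y in enumerate(ys) if y is not None]
    if predict is None:
        a, b = fit_line([xs[i] for i in observed], [ys[i] for i in observed])
        predict = lambda x: a + b * x
    means = [predict(x) for x in xs]
    imputed = list(ys)
    for j, y in enumerate(ys):
        if y is None:
            donor = min(observed, key=lambda i: abs(means[i] - means[j]))
            imputed[j] = ys[donor]  # copy an actually observed value
    return imputed


def mixture_mean(x, components):
    """E[Y | X = x] under a bivariate Gaussian mixture.

    components: list of (weight, mean_x, mean_y, var_x, cov_xy), assumed
    known here. Each component contributes its own regression line,
    weighted by the posterior probability of that component given x.
    """
    dens = [
        w * math.exp(-0.5 * (x - mx) ** 2 / vx) / math.sqrt(2 * math.pi * vx)
        for w, mx, my, vx, cxy in components
    ]
    total = sum(dens)
    return sum(
        d / total * (my + cxy / vx * (x - mx))
        for d, (w, mx, my, vx, cxy) in zip(dens, components)
    )


xs = [1.0, 2.0, 3.4, 4.0, 5.0]
ys = [2.1, 3.9, None, 8.2, 9.8]
linear_result = pmm_impute(xs, ys)

# Hypothetical two-component mixture used as the predictor instead.
comps = [(0.5, 1.5, 3.0, 1.0, 1.8), (0.5, 4.5, 9.0, 1.0, 1.6)]
mixture_result = pmm_impute(xs, ys, predict=lambda x: mixture_mean(x, comps))
```

With a single mixture component, `mixture_mean` reduces to one regression line, which mirrors the abstract's remark that ordinary predictive mean matching is a special case. With several components the conditional mean becomes a weighted blend of lines and can follow a nonlinear relationship, while the imputed value is still always copied from an observed donor.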

Author information

Corresponding author

Correspondence to Marco Di Zio.

About this article

Cite this article

Di Zio, M., Guarnera, U. Semiparametric predictive mean matching. AStA Adv Stat Anal 93, 175–186 (2009). https://doi.org/10.1007/s10182-008-0081-2

  • DOI: https://doi.org/10.1007/s10182-008-0081-2
