Research note: imputing large group averages for missing data, using rural-urban continuum codes for density driven industry sectors

Journal of Population Research

Abstract

Understanding the effects and consequences of missing data imputation is vital to obtaining meaningful and reliable statistics and coefficients when examining any quantitative phenomenon. Over time, a series of sophisticated methods has been developed to handle missing data imputation; however, these methods may not always be appropriate or attainable. In such cases, more traditional approaches to missing data imputation must be employed, driven by the research project, the theoretical framework, and the data. In this research note we offer a brief account of one such instance, implementing a large-group mean imputation approach to handling missing data. The analysis is drawn from a much larger project and shows the effect of proper group selection on mean imputation, using a cross-validation approach based on the imputed data’s relation to known values. Ultimately, the results show that the use of Rural-Urban Continuum codes is superior to the group means currently used in the U.S., thus introducing a new, and more efficient, approach to handling missing data using group-mean imputation.
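To make the idea concrete, the following is a minimal sketch of group-mean imputation with a leave-one-out check against known values. It is not the authors' implementation: the function names, the field names (`rucc`, `region`, `value`), the alternative `region` grouping, and the toy figures are all hypothetical and chosen only for illustration. Each known value is hidden in turn, re-imputed from the remaining members of its group, and compared with the hidden value; a lower mean absolute error suggests a better choice of grouping variable.

```python
from collections import defaultdict
from statistics import mean


def group_means(records, group_key, value_key):
    """Return the mean of the observed (non-None) values in each group."""
    observed = defaultdict(list)
    for rec in records:
        if rec[value_key] is not None:
            observed[rec[group_key]].append(rec[value_key])
    return {group: mean(values) for group, values in observed.items()}


def impute_group_mean(records, group_key, value_key):
    """Copy the records, filling each missing value with its group's mean.

    A value stays None when its group has no observed values at all.
    """
    means = group_means(records, group_key, value_key)
    filled = []
    for rec in records:
        new_rec = dict(rec)
        if new_rec[value_key] is None:
            new_rec[value_key] = means.get(new_rec[group_key])
        filled.append(new_rec)
    return filled


def leave_one_out_mae(records, group_key, value_key):
    """Hide each known value in turn, impute it from the rest of its group,
    and return the mean absolute error against the hidden value."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    known = [rec for rec in records if rec[value_key] is not None]
    for rec in known:
        totals[rec[group_key]] += rec[value_key]
        counts[rec[group_key]] += 1
    errors = []
    for rec in known:
        group, value = rec[group_key], rec[value_key]
        if counts[group] < 2:
            continue  # no other observed value to impute from
        mean_without = (totals[group] - value) / (counts[group] - 1)
        errors.append(abs(value - mean_without))
    return mean(errors)


# Hypothetical toy data: six counties, two of them with a missing value.
counties = [
    {"name": "A", "region": "X", "rucc": 1, "value": 100.0},
    {"name": "B", "region": "X", "rucc": 9, "value": 10.0},
    {"name": "C", "region": "Y", "rucc": 1, "value": 110.0},
    {"name": "D", "region": "Y", "rucc": 9, "value": 20.0},
    {"name": "E", "region": "X", "rucc": 1, "value": None},
    {"name": "F", "region": "Y", "rucc": 9, "value": None},
]

# Compare the two candidate groupings, then impute with the chosen one.
print("MAE by rucc:  ", leave_one_out_mae(counties, "rucc", "value"))
print("MAE by region:", leave_one_out_mae(counties, "region", "value"))
print(impute_group_mean(counties, "rucc", "value"))
```

In this contrived data the values cluster by the `rucc` field, so grouping on it yields the smaller error; with real data the comparison would have to be run on the actual known values.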

Notes

  1. All nine Beale code categories were checked to ensure that each retained a large enough n to compute a statistically meaningful group average.

References

  • Afifi, A. A., & Elashoff, R. M. (1966). Missing observations in multivariate statistics: Review of the literature. Journal of the American Statistical Association, 61, 595–604.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • Economic Research Service (ERS). (2004). Measuring rurality: Rural-urban continuum codes. Retrieved April 28, 2004, from http://www.ers.usda.gov/Briefing/Rurality/RuralUrbCon/.

  • Gelman, A., King, G., & Liu, C. (1998). Not asked and not answered: Multiple imputation for multiple surveys. Journal of the American Statistical Association, 93, 846–874.

  • Hartley, H. O., & Hocking, R. R. (1971). The analysis of incomplete data. Biometrics, 27, 783–808.

  • Little, R. J. A., & Rubin, D. B. (1983). Incomplete data. Encyclopedia of Statistical Science, 4, 46–53.

  • Myrtveit, I., Stensrud, E., & Olsson, U. H. (2001). Analyzing data sets with missing data: An empirical evaluation of imputation methods and likelihood-based methods. IEEE Transactions on Software Engineering, 27, 999–1013.

Acknowledgments

The authors would like to acknowledge support for this project from the Social Science Research Center and Mississippi State University, through a master grant from the Highway Watch (HWW) in conjunction with the U.S. Department of Homeland Security (USDHS).

Author information

Corresponding author

Correspondence to Jeremy R. Porter.

About this article

Cite this article

Porter, J.R., Cossman, R.E. & James, W.L. Research note: imputing large group averages for missing data, using rural-urban continuum codes for density driven industry sectors. J Pop Research 26, 273–278 (2009). https://doi.org/10.1007/s12546-009-9018-1

  • DOI: https://doi.org/10.1007/s12546-009-9018-1
