Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey
Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory.
Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin’s “missing completely at random,” “missing at random,” and “missing not at random”). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36.
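As an illustration of the three types of missing data described above, the following is a minimal sketch of how item non-response could be generated at a target rate. It is not the procedure used in the study: the function name `make_missing`, the fully observed `covariate` (for example age), and the linear weighting of the missingness probability are all assumptions made for the example.

```python
import numpy as np

def make_missing(items, rate, mechanism, rng, covariate=None):
    """Return a copy of `items` (subjects x items) with about `rate` of cells set to NaN.

    Hypothetical helper: the weighting scheme below is an assumption, not the study's.
    """
    items = np.asarray(items, dtype=float)
    if mechanism == "MCAR":
        # Every cell has the same chance of being missing.
        weight = np.ones_like(items)
    elif mechanism == "MAR":
        # Chance depends only on a fully observed covariate (assumed to vary across subjects).
        cov = np.asarray(covariate, dtype=float)
        z = (cov - cov.min()) / (cov.max() - cov.min())
        weight = np.repeat((0.5 + z)[:, None], items.shape[1], axis=1)
    elif mechanism == "MNAR":
        # Chance depends on the (soon unobserved) item value itself: low answers go missing more often.
        z = (items - items.min()) / (items.max() - items.min())
        weight = 1.5 - z
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    # Rescale so that the average missingness probability equals the target rate.
    prob = np.clip(rate * weight / weight.mean(), 0.0, 1.0)
    holed = items.copy()
    holed[rng.random(items.shape) < prob] = np.nan
    return holed

# Usage on synthetic 5-point items (not survey data).
rng = np.random.default_rng(0)
answers = rng.integers(1, 6, size=(1000, 10)).astype(float)
age = rng.integers(18, 90, size=1000).astype(float)
for mechanism in ("MCAR", "MAR", "MNAR"):
    holed = make_missing(answers, 0.06, mechanism, rng, covariate=age)
    print(mechanism, round(float(np.isnan(holed).mean()), 3))
```

The three branches differ only in what the missingness probability is allowed to depend on, which is the distinction drawn by Little and Rubin's classification.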
For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias <2%) in all studied situations.
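The personal mean score and the relative bias criterion mentioned above can be sketched as follows. This is a hedged example rather than the authors' code: the names `personal_mean_score` and `relative_bias` are hypothetical, and the requirement that at least half of a scale's items be answered (`min_fraction=0.5`) is an assumption borrowed from common SF-36 scoring practice, not something stated in this abstract.

```python
import numpy as np

def personal_mean_score(items, min_fraction=0.5):
    """Per-subject scale score: mean of the answered items (NaN marks a missing item).

    Subjects who answered fewer than `min_fraction` of the items get NaN.
    The 0.5 default is an assumption, not taken from the source.
    """
    items = np.asarray(items, dtype=float)
    answered = ~np.isnan(items)
    n_answered = answered.sum(axis=1)
    total = np.where(answered, items, 0.0).sum(axis=1)
    enough = (n_answered >= min_fraction * items.shape[1]) & (n_answered > 0)
    score = np.full(items.shape[0], np.nan)
    score[enough] = total[enough] / n_answered[enough]
    return score

def relative_bias(estimate, truth):
    """Relative bias in percent of an estimate against the complete-data value."""
    return 100.0 * (estimate - truth) / truth

# Usage on a toy 3-subject, 4-item scale.
scale = np.array([
    [1.0, 2.0, np.nan, 3.0],        # one item missing: mean of the three answered items
    [np.nan, np.nan, np.nan, 4.0],  # too few answers: score left missing
    [2.0, 2.0, 2.0, 2.0],           # complete
])
print(personal_mean_score(scale))
```

Replacing a missing item by the subject's own mean over the answered items of the same scale gives the same scale score as averaging the answered items directly, which is why the sketch never fills in individual cells.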
Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
Keywords: Quality of life, SF-36, Missing data, Imputation methods
Missing completely at random
Missing at random
Missing not at random
Personal mean score
Quality of life
Medical outcome study 36-item short-form health survey
Full information maximum likelihood
We thank Jean Louis Lanoë for allowing us to exploit the data of the 2003 Decennial Health Survey. We also thank David Jegou and Vivian Viallon for assistance with simulations.
- 7. Little, R., & Rubin, D. (1987). Statistical analysis with missing data. New York: John Wiley and Sons.
- 9. Fielding, S., Fayers, P. M., & Ramsay, C. R. (2009). Investigating the missing data mechanism in quality of life outcomes: A comparison of approaches. Health and Quality of Life Outcomes, 8(16), 1477–7525.
- 20. Sande, I. (1983). Hot deck imputation procedures, incomplete data in sample surveys. New York: Academic Press. (Book).
- 24. Lanoë, J., & Makdessi-Raynaud, Y. (2005). L’état de santé en France en 2003. Santé perçue, morbidité déclarée et recours aux soins à travers l’enquête décennale santé. Etudes et résultats (DREES), 436, 1–12.
- 26. Stewart, A. L., & Ware, J. E. (1992). Measuring functioning and well-being. Durham and London: D.U. Press. (Book).
- 27. Leplege, A., Ecosse, E., Verdier, A., & Perneger, T. V. (1998). The French SF-36 Health Survey: Translation, cultural adaptation and preliminary psychometric evaluation. Journal of Clinical Epidemiology, 51(11), 1013–1023.
- 28. Leplege, A., Ecosse, E., Pouchot, J., Coste, J., & Perneger, T. V. (2001). Le questionnaire MOS SF-36, manuel de l’utilisation et guide d’interprétation des scores. Paris: ESTEM. (Book).
- 32. Gandek, B., Ware, J. E., Aaronson, N. K., Alonso, J., Apolone, G., Bjorner, J., et al. (1998). Tests of data quality, scaling assumptions, and reliability of the SF-36 in eleven countries: Results from the IQOLA Project. Journal of Clinical Epidemiology, 51(11), 1149–1158.
- 35. Vermunt, J. K., Van Ginkel, J. R., Van der Ark, L. A., & Sijtsma, K. (2008). Multiple imputation of incomplete categorical data using latent class analysis. Sociological Methodology, 38, 369–397.
- 36. Reiter, J., Raghunathan, T. E., & Kinney, S. K. (2006). The importance of modeling the sampling design in multiple imputation for missing data. Survey Methodology, 32(2), 143–149.
- 37. Binder, D. A., & Sun, W. (1996). Frequency valid multiple imputation for surveys with a complex design. Proceedings of the Survey Research Methods Section, ASA, 281–286.
- 38. Sterne, J. A. C., White, I. R., Carlin, J. B., Spratt, M., Royston, P., Kenward, M. G., et al. (2009). Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ, 338, b2393.