Abstract
Objective: There are multiple reasons for missing data in observational studies; excluding patients with missing data can lead to significant bias. In this study, we evaluated several methods for assigning missing values to health service utilisation.
Design and setting: Cancer of the Prostate Strategic Urologic Research Endeavor (CaPSURE) is a US national database of men with prostate cancer. Physician visits and diagnostic tests for 342 patients newly diagnosed with prostate cancer were evaluated.
Patients and participants: Patients followed for a full year (observed data, n = 228) and patients with incomplete data (predicted data, n = 114) were included.
Interventions: We used the following approaches for imputing missing data: assigning the group mean, a time-specific mean, a patient-specific mean, a stratified mean (by age, localised disease and insurance status) and carrying the last observation forward and/or backward.
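The single-imputation approaches listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the four-period layout, and all function names are assumptions made for illustration, and the stratified mean is omitted because it amounts to applying the group mean separately within each stratum (age, localised disease, insurance status).

```python
from statistics import mean

# Hypothetical toy data: physician visits per follow-up period for three
# patients. None marks a period with missing data. Values are illustrative only.
visits = {
    "A": [4, 6, 5, 5],
    "B": [2, None, 4, None],
    "C": [None, 8, None, 6],
}

def observed(values):
    """Return only the non-missing values."""
    return [v for v in values if v is not None]

def group_mean(data):
    """Replace each missing value with the mean of all observed values."""
    pooled = mean(v for row in data.values() for v in observed(row))
    return {p: [pooled if v is None else v for v in row] for p, row in data.items()}

def time_specific_mean(data):
    """Replace each missing value with the mean observed in the same period."""
    n_periods = len(next(iter(data.values())))
    period_means = [
        mean(observed([row[t] for row in data.values()])) for t in range(n_periods)
    ]
    return {
        p: [period_means[t] if v is None else v for t, v in enumerate(row)]
        for p, row in data.items()
    }

def patient_mean(data):
    """Replace each missing value with that patient's own observed mean.

    Assumes every patient has at least one observed period.
    """
    filled = {}
    for p, row in data.items():
        own = mean(observed(row))
        filled[p] = [own if v is None else v for v in row]
    return filled

def carry_forward_then_backward(data):
    """Carry the last observation forward; fill leading gaps backward."""
    filled = {}
    for p, row in data.items():
        out = list(row)
        last = None
        for i, v in enumerate(out):  # forward pass
            if v is None:
                out[i] = last
            else:
                last = v
        nxt = None
        for i in range(len(out) - 1, -1, -1):  # backward pass for leading gaps
            if out[i] is None:
                out[i] = nxt
            else:
                nxt = out[i]
        filled[p] = out
    return filled

def annual_totals(data):
    """Sum the periods to get an annual count per patient."""
    return {p: sum(row) for p, row in data.items()}
```

Calling `annual_totals(patient_mean(visits))` and `annual_totals(carry_forward_then_backward(visits))` on the same data shows how the choice of strategy alone can shift the estimated annual utilisation for patients with incomplete follow-up.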
Main outcome measures and results: All prediction strategies produced higher estimates of annual physician visits (19.3 to 23.1) than the observed value (17.1 ± 15.5). The differences were statistically significant for the last observation carried forward (23.1 ± 15.5) and for the patient’s individual mean (22.7 ± 36.1). The same two strategies also gave higher predicted values for x-rays (1.8 ± 5.1 and 1.8 ± 4.4 vs 1.1 ± 1.9 for the observed group), although the estimate from the last observation carried forward was not statistically different from the observed value.
Conclusions: We were unable to identify a single optimal strategy. However, imputation from individual means and the last observation carried forward methods did not perform as well as the other strategies. While the differences observed in this study were small, we anticipate that with increased length of follow-up and more dropouts, there would be greater differences among strategies.
Cite this article
Lubeck, D.P., Pasta, D.J., Flanders, S.C. et al. Approaches to Missing Data Inference Results from CaPSURE. Pharmacoeconomics 15, 197–204 (1999). https://doi.org/10.2165/00019053-199915020-00007
DOI: https://doi.org/10.2165/00019053-199915020-00007