Approaches to Missing Data Inference: Results from CaPSURE

An Observational Study of Patients with Prostate Cancer

  • Original Research Article
  • PharmacoEconomics

Abstract

Objective: There are multiple reasons for missing data in observational studies; excluding patients with missing data can lead to significant bias. In this study, we evaluated several methods for assigning missing values to health service utilisation.

Design and setting: Cancer of the Prostate Strategic Urologic Research Endeavor (CaPSURE) is a US national database of men with prostate cancer. Physician visits and diagnostic tests for 342 patients newly diagnosed with prostate cancer were evaluated.

Patients and participants: Patients followed for a full year (observed data, n = 228) and patients with incomplete data (predicted data, n = 114) were included.

Interventions: We used the following approaches for imputing missing data: assigning the group mean, a time-specific mean, a patient-specific mean, a stratified mean (by age, localised disease and insurance status) and carrying the last observation forward and/or backward.
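The imputation approaches listed above can be sketched on a toy dataset. This is a minimal illustration, not the study's implementation: the patient identifiers, quarterly visit counts, single-factor strata, and all function names are hypothetical, and the abstract does not say how reporting periods were defined or how the stratified mean combined age, localised disease and insurance status.

```python
from statistics import mean

# Hypothetical toy data: visit counts per reporting period for three patients.
# None marks a period with no report. All values are invented for illustration.
visits = {
    "A": [4, 5, 6, 5],
    "B": [2, None, 3, None],
    "C": [None, 8, None, None],
}
# Hypothetical single stratification factor (the study stratified by age,
# localised disease and insurance status).
strata = {"A": "under65", "B": "under65", "C": "65plus"}


def observed(values):
    """Return only the non-missing entries."""
    return [v for v in values if v is not None]


def fill(row, replacement):
    """Replace each missing entry using replacement(period_index)."""
    return [replacement(t) if v is None else v for t, v in enumerate(row)]


def group_mean_impute(data):
    """Assign the mean of all observed values in the whole group."""
    overall = mean(v for row in data.values() for v in observed(row))
    return {pid: fill(row, lambda t: overall) for pid, row in data.items()}


def time_specific_mean_impute(data):
    """Assign the mean of the same period across patients who reported it."""
    n_periods = len(next(iter(data.values())))
    period_means = [
        mean(observed([row[t] for row in data.values()]))
        for t in range(n_periods)
    ]
    return {pid: fill(row, lambda t: period_means[t]) for pid, row in data.items()}


def patient_mean_impute(data):
    """Assign each patient's own mean over the periods they did report."""
    return {
        pid: fill(row, lambda t, m=mean(observed(row)): m)
        for pid, row in data.items()
    }


def stratified_mean_impute(data, strata):
    """Assign the mean of observed values among patients in the same stratum."""
    pooled = {}
    for pid, row in data.items():
        pooled.setdefault(strata[pid], []).extend(observed(row))
    return {
        pid: fill(row, lambda t, m=mean(pooled[strata[pid]]): m)
        for pid, row in data.items()
    }


def locf_impute(data):
    """Carry the last observation forward, then backward for leading gaps."""
    result = {}
    for pid, row in data.items():
        filled, last = list(row), None
        for i, v in enumerate(filled):          # forward pass
            if v is None:
                filled[i] = last
            else:
                last = v
        nxt = None
        for i in range(len(filled) - 1, -1, -1):  # backward pass
            if filled[i] is None:
                filled[i] = nxt
            else:
                nxt = filled[i]
        result[pid] = filled
    return result


def annual_totals(imputed):
    """Sum each patient's periods to get an annual count."""
    return {pid: sum(row) for pid, row in imputed.items()}
```

Calling `annual_totals(locf_impute(visits))` and the same for the other functions gives one annual estimate per patient under each strategy, which is the quantity the study compares against fully observed patients.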

Main outcome measures and results: All prediction strategies resulted in higher estimates (19.3 to 23.1) for annual physician visits than was observed (17.1 ± 15.5), and differences were statistically significant for both the last observation carried forward (23.1 ± 15.5) and the patient’s individual mean (22.7 ± 36.1) when predicting physician visits. The same strategies had higher predicted values for x-rays (1.8 ± 5.1 and 1.8 ± 4.4 vs 1.1 ± 1.9 for the observed group), although the last observation carried forward was not statistically different from the observed value.
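To put the reported physician-visit estimates on a common scale, the gap between each imputed mean and the observed mean can be expressed as a percentage of the observed mean. The sketch below uses only the means quoted above; the function name and dictionary labels are illustrative, and the calculation says nothing about statistical significance, which depends on the variances and the test the authors used.

```python
# Means quoted in the abstract for annual physician visits.
OBSERVED_MEAN = 17.1
imputed_means = {
    "lowest imputed estimate": 19.3,
    "patient-specific mean": 22.7,
    "last observation carried forward": 23.1,
}


def percent_above(estimate, reference):
    """Relative difference of estimate over reference, in percent."""
    return 100 * (estimate - reference) / reference


relative_gaps = {
    label: round(percent_above(value, OBSERVED_MEAN), 1)
    for label, value in imputed_means.items()
}

for label, gap in relative_gaps.items():
    print(f"{label}: {gap}% above the observed mean")
```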

Conclusions: We were unable to identify a single optimal strategy. However, imputation from individual means and the last observation carried forward methods did not perform as well as the other strategies. While the differences observed in this study were small, we anticipate that with increased length of follow-up and more dropouts, there would be greater differences among strategies.

Corresponding author

Correspondence to Deborah P. Lubeck.

Cite this article

Lubeck, D.P., Pasta, D.J., Flanders, S.C. et al. Approaches to Missing Data Inference Results from CaPSURE. Pharmacoeconomics 15, 197–204 (1999). https://doi.org/10.2165/00019053-199915020-00007
