Model-based imputation of missing data from the 122 Cities Mortality Reporting System (122 CMRS)

Stochastic Environmental Research and Risk Assessment

Abstract

National estimates of the all-cause and pneumonia and influenza (P&I) mortality burden derived from U.S. influenza surveillance data treat all missing or unreported values as zero counts. This methodological decision undercounts influenza deaths and therefore biases estimates of the true mortality burden downward. In this paper, a regression-based procedure is proposed to impute missing values and thus produce a more accurate estimate of mortality. Several model specifications are considered and evaluated for predicting weekly death counts by city, calendar week, calendar year and age group. Revised all-cause, P&I and excess mortality estimates are then calculated from the imputed data. The impact of the treatment of unreported mortality data on national estimates is evaluated by comparing the estimates obtained with and without imputation. This comparison reveals some differences in mortality burden, excess deaths, and trends over time. The model presented is a useful approach for imputing missing counts and improving inference in situations with a modest occurrence of missing data.
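The abstract does not state the model specification, so the following is only a minimal sketch of what regression-based imputation of weekly counts can look like, not the authors' method. It assumes a Poisson log-linear model with city effects and a single annual harmonic, fitted by iteratively reweighted least squares on synthetic data. All names (`design`, `fit_poisson_irls`), the city baselines, the seasonal amplitude and the 10% missingness rate are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cities, n_weeks = 5, 104                      # hypothetical panel size
city = np.repeat(np.arange(n_cities), n_weeks)
week = np.tile(np.arange(n_weeks), n_cities)

def design(city, week):
    """Intercept, city dummies (first city as reference), annual sine/cosine."""
    ones = np.ones(len(city))
    dummies = (city[:, None] == np.arange(1, n_cities)).astype(float)
    angle = 2 * np.pi * week / 52.0
    return np.column_stack([ones, dummies, np.sin(angle), np.cos(angle)])

def fit_poisson_irls(X, y, n_iter=50, tol=1e-8):
    """Poisson regression with log link via iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                  # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                 # working response
        WX = X * mu[:, None]                    # IRLS weights equal mu for the log link
        beta_new = np.linalg.solve(X.T @ WX, X.T @ (mu * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic "true" weekly deaths: city baseline times a winter-peaking season.
baseline = np.array([20.0, 40.0, 60.0, 80.0, 100.0])
true_mean = np.exp(np.log(baseline)[city] + 0.3 * np.cos(2 * np.pi * week / 52.0))
deaths = rng.poisson(true_mean)

# Assume about 10% of city-weeks go unreported, completely at random.
missing = rng.random(deaths.size) < 0.10

X = design(city, week)
beta = fit_poisson_irls(X[~missing], deaths[~missing].astype(float))

# Replace unreported city-weeks with the model's predicted mean count.
imputed = deaths.astype(float)
imputed[missing] = np.exp(X[missing] @ beta)

true_total = deaths.sum()
zero_filled_total = deaths[~missing].sum()      # missing values treated as zeros
imputed_total = imputed.sum()
print(true_total, zero_filled_total, round(imputed_total))
```

In this toy setting the zero-filled total falls short of the true total by roughly the missing fraction, while the imputed total should land much closer. Real surveillance data would also call for age-group terms, a year or trend term, and some accounting for imputation uncertainty, none of which are shown here.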



Acknowledgments

PM would like to acknowledge the financial support of NIH/NIAID Grant R01AI097015 and “la Caixa” Foundation, Spain. AO acknowledges financial support for this research from CDC BioSense research Grant R01 PH000021-01.

Author information

Corresponding author

Correspondence to Paula Moraga.

About this article

Cite this article

Moraga, P., Ozonoff, A. Model-based imputation of missing data from the 122 Cities Mortality Reporting System (122 CMRS). Stoch Environ Res Risk Assess 29, 1499–1507 (2015). https://doi.org/10.1007/s00477-014-0974-4

  • DOI: https://doi.org/10.1007/s00477-014-0974-4
