Statistical Methods for Generalized Linear Models with Covariates Subject to Detection Limits

Abstract

Censored observations are a common occurrence in biomedical data sets. Although a large amount of research has been devoted to estimation and inference for data with censored responses, very little research has focused on proper statistical procedures when predictors are censored. In this paper, we consider statistical methods for dealing with multiple predictors subject to detection limits within the context of generalized linear models. We investigate and adapt several conventional methods and develop a new multiple imputation approach for analyzing data sets with predictors censored due to detection limits. We establish the consistency and asymptotic normality of the proposed multiple imputation estimator and suggest a computationally simple and consistent variance estimator. We also demonstrate that the conditional mean imputation method often leads to inconsistent estimates in generalized linear models, while several other methods are either computationally intensive or lead to parameter estimates that are biased or more variable compared to the proposed multiple imputation estimator. In an extensive simulation study, we assess the bias and variability of different approaches within the context of a logistic regression model and compare variance estimation methods for the proposed multiple imputation estimator. Lastly, we apply several methods to analyze the data set from a recently conducted GenIMS study.
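To make the detection-limit setting concrete, the following is a minimal sketch of how a predictor is left-censored at a detection limit and how several completed data sets can be generated by drawing the censored values from a truncated distribution. It is an illustration under stated assumptions, not the estimator proposed in the paper: it assumes a single normally distributed predictor whose mean and standard deviation are treated as known (in practice they would have to be estimated), and the function names, the limit of -0.5, and the number of imputations are hypothetical choices.

```python
import random
from statistics import NormalDist


def censor_at_limit(x, limit):
    """Left-censor values at `limit`: return (observed values, censoring flags)."""
    observed = [max(v, limit) for v in x]   # values below the limit are recorded as the limit
    censored = [v < limit for v in x]       # True where the true value fell below the limit
    return observed, censored


def impute_below_limit(observed, censored, limit, mu, sigma, rng):
    """Replace each censored value with a draw from N(mu, sigma^2) truncated to (-inf, limit].

    Assumption: mu and sigma are treated as known for illustration only.
    """
    dist = NormalDist(mu, sigma)
    p_limit = dist.cdf(limit)               # P(X <= limit)
    completed = []
    for value, is_censored in zip(observed, censored):
        if is_censored:
            # Inverse-CDF sampling: u is uniform on (0, p_limit]
            u = p_limit * (1.0 - rng.random())
            completed.append(dist.inv_cdf(u))
        else:
            completed.append(value)         # detected values are kept as measured
    return completed


rng = random.Random(2015)
mu, sigma, limit, n_imputations = 0.0, 1.0, -0.5, 5   # hypothetical settings

x_true = [rng.gauss(mu, sigma) for _ in range(200)]
observed, censored = censor_at_limit(x_true, limit)

# One completed copy of the predictor per imputation
completed_sets = [
    impute_below_limit(observed, censored, limit, mu, sigma, rng)
    for _ in range(n_imputations)
]
```

In a typical multiple imputation analysis, the regression model would then be fitted to each completed data set and the resulting estimates combined, for example by averaging them.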

Acknowledgements

Wang’s research was supported in part by NSF award DMS-1007420 and NSF CAREER award DMS-1149355, and Zhang’s research was supported in part by NIH grant R01 CA85848-12 and NIH/NIAID grant R37 AI031789-20.

The authors are grateful to the editor, an associate editor, and two anonymous referees for their valuable comments. The authors also would like to thank Dr. Lan Kong of Penn State College of Medicine and the CRISMA (Clinical Research, Investigation, and Systems Modeling of Acute Illness) Center at the University of Pittsburgh for providing the GenIMS data set.

Corresponding author

Correspondence to Huixia J. Wang.

Electronic Supplementary Material

Electronic supplementary material is available for this article (PDF, 457 kB).

About this article

Cite this article

Bernhardt, P.W., Wang, H.J. & Zhang, D. Statistical Methods for Generalized Linear Models with Covariates Subject to Detection Limits. Stat Biosci 7, 68–89 (2015). https://doi.org/10.1007/s12561-013-9099-4

  • DOI: https://doi.org/10.1007/s12561-013-9099-4

Keywords

  • Censored predictor
  • Complete case
  • Conditional mean imputation
  • Detection limit
  • Improper multiple imputation