Lifetime Data Analysis, Volume 17, Issue 2, pp 256–279

Linear regression analysis of survival data with missing censoring indicators



Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
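The inverse probability weighting idea can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not the estimator defined in the paper: the censoring survival function and the probability that an indicator is observed are both treated as known (in practice they would be estimated), there is a single covariate, and all names (`simulate`, `ipw_synthetic_response`, `ols`, `fit`) and parameter values are hypothetical. Each observed time is turned into a synthetic response whose conditional mean equals that of the true lifetime, and ordinary least squares is then applied.

```python
import math
import random


def simulate(n, b0=1.0, b1=2.0, rate=0.2, seed=0):
    """Simulate rows (x, z, delta, xi, pi) from T = b0 + b1*x + error.

    Assumptions made for this sketch only: exponential censoring that is
    independent of (x, T), and indicators missing at random with a known
    observation probability pi(x).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        t = b0 + b1 * x + rng.uniform(-0.5, 0.5)  # true lifetime
        c = rng.expovariate(rate)                 # censoring time
        z = min(t, c)                             # observed time
        delta = 1 if t <= c else 0                # censoring indicator
        pi = 0.6 + 0.3 * x                        # P(indicator observed | x)
        xi = 1 if rng.random() < pi else 0        # 1 if delta is observed
        rows.append((x, z, delta, xi, pi))
    return rows


def ipw_synthetic_response(z, delta, xi, pi, rate=0.2):
    """Inverse-probability-weighted synthetic response for one subject."""
    if xi == 0:
        return 0.0  # delta is missing, so it is never used
    surv_c = math.exp(-rate * z)  # known censoring survival function (assumed)
    return delta * z / (pi * surv_c)


def ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit(rows, rate=0.2):
    """Least squares on the synthetic responses."""
    xs = [r[0] for r in rows]
    ys = [ipw_synthetic_response(z, d, xi, pi, rate) for _, z, d, xi, pi in rows]
    return ols(xs, ys)


if __name__ == "__main__":
    b0_hat, b1_hat = fit(simulate(20000, seed=1))
    print(f"intercept estimate: {b0_hat:.3f}, slope estimate: {b1_hat:.3f}")
```

With a large simulated sample the fitted intercept and slope should land near the generating values (1 and 2 here), although the weighting inflates the variance of the synthetic responses, so the estimates are noisier than least squares on uncensored data would be.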


Keywords: Asymptotic normality; Censoring indicator; Imputation; Inverse probability weighting; Least squares; Missing at random; Regression calibration





Copyright information

© US Government 2010

Authors and Affiliations

  1. Department of Mathematics and Statistics, Yunnan University, Kunming, China
  2. Academy of Mathematics and Systems Science, Chinese Academy of Science, Beijing, China
  3. Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
