Lifetime Data Analysis

Volume 22, Issue 4, pp 570–588

Cox regression with missing covariate data using a modified partial likelihood method

  • Torben Martinussen
  • Klaus K. Holst
  • Thomas H. Scheike


Missing covariate values are a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids the use of the EM algorithm. It exploits the fact that the observed hazard function is multiplicative in the baseline hazard function, the idea being to profile out this function before estimating the parameter of interest. In this step a Breslow-type estimator is used to estimate the cumulative baseline hazard function. We focus on the situation where the observed covariates are categorical, which allows us to calculate estimators without assuming anything about the distribution of the covariates. We show that the proposed estimator is consistent and asymptotically normal, and derive a consistent estimator of the variance–covariance matrix that does not involve any choice of a perturbation parameter. Moderate sample size performance of the estimators is investigated via simulation and by application to a real data example.
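As a rough illustration of the profiling step, the sketch below computes the standard complete-data Breslow estimator of the cumulative baseline hazard for a fixed regression parameter. It is not the modified estimator proposed in the paper, which additionally handles missing covariates. The function name `breslow_cumulative_hazard` and its arguments are hypothetical and chosen only for this example.

```python
import math


def breslow_cumulative_hazard(times, events, covariates, beta):
    """Complete-data Breslow estimate of the cumulative baseline hazard.

    Illustrative assumption: every covariate is fully observed, so this is
    the textbook estimator, not the paper's missing-data version.
    Returns a list of (event_time, cumulative_hazard) pairs.
    """
    # Relative risk exp(beta'Z_j) for every subject j.
    risk = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in covariates]

    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, d in zip(times, events) if d})

    cumulative = 0.0
    steps = []
    for t in event_times:
        # Number of events at time t.
        n_events = sum(1 for s, d in zip(times, events) if d and s == t)
        # Sum of relative risks over subjects still at risk at time t.
        at_risk = sum(r for s, r in zip(times, risk) if s >= t)
        cumulative += n_events / at_risk
        steps.append((t, cumulative))
    return steps


# Three subjects; the second is censored at time 2.
steps = breslow_cumulative_hazard(
    times=[1.0, 2.0, 3.0],
    events=[1, 0, 1],
    covariates=[[0.0], [1.0], [0.0]],
    beta=[0.0],
)
```

With `beta` set to zero every subject has relative risk one, so the estimate reduces to the Nelson–Aalen estimator. In a profiling approach an expression of this kind is substituted for the baseline hazard, leaving an objective that depends on the regression parameter only.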


Cox model · Missing covariate data · Recursive estimation · Survival data



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Torben Martinussen (1)
  • Klaus K. Holst (1)
  • Thomas H. Scheike (1)
  1. Department of Biostatistics, University of Copenhagen, Copenhagen K, Denmark
