Abstract
Survival data with missing censoring indicators are frequently encountered in biomedical studies. In this paper, we consider statistical inference for this type of data under the additive hazard model. Reweighting methods based on simple and augmented inverse probability weighting are proposed. The asymptotic properties of the proposed estimators are established. Furthermore, we provide a numerical technique for checking the adequacy of the fitted model with missing censoring indicators. Our simulation results show that the proposed estimators outperform the simple and augmented inverse probability weighted estimators without reweighting. The proposed methods are illustrated by analyzing a dataset from a breast cancer study.
References
Breslow N, Day N (1980) Statistical models in cancer research, 1, the design and analysis of case-control studies, IARC, Lyon
Cummings F, Gray R, Davis T, Tormey D, Harris J, Falkson G, Arseneau J (1986) Tamoxifen versus placebo: double-blind adjuvant trial in elderly women with stage II breast cancer. NCI Monograph 1:119–123
Horvitz D, Thompson D (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47(260):663–685
Hyun S, Lee J, Sun Y (2012) Proportional hazards model for competing risks data with missing cause of failure. J Stat Plan Inference 142(7):1767–1779
Li X, Wang Q (2012) The weighted least square based estimators with censoring indicators missing at random. J Stat Plan Inference 142(11):2913–2925
Lin W (2011) Missing covariates and high dimensional variable selection in additive hazards regression. PhD thesis, University of Southern California
Lin D, Wei L, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80(3):557–572
Lin D, Ying Z (1994) Semiparametric analysis of the additive risk model. Biometrika 81(1):61–71
Liu C, Wang Q (2010) Semiparametric estimation for regression coefficients in the Cox model with failure indicators missing at random. Stat Sinica 20(3):1125–1142
Lu K, Tsiatis A (2001) Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics 57(4):1191–1197
Lu W, Liang Y (2008) Analysis of competing risks data with missing cause of failure under additive hazards model. Stat Sinica 18(1):219–234
McKeague I, Subramanian S (1998) Product-limit estimators and Cox regression with missing censoring information. Scand J Stat 25(4):589–601
Qi L, Wang C, Prentice R (2005) Weighted estimators for proportional hazards regression with missing covariates. J Am Stat Assoc 100(472):1250–1263
Robins J, Rotnitzky A, Zhao L (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89(427):846–866
Song X, Sun L, Mu X, Dinse G (2010) Additive hazards regression with censoring indicators missing at random. Can J Stat 38(3):333–351
Sun Y, Wang H, Gilbert P (2012) Quantile regression for competing risks data with missing cause of failure. Stat Sinica 22(2):703–728
Tsiatis A (2006) Semiparametric theory and missing data. Springer, New York
Wang Q, Dinse G (2011) Linear regression analysis of survival data with missing censoring indicators. Lifetime Data Anal 17(2):256–279
White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50(1):1–25
Xu Q, Paik M, Luo X, Tsai W (2009) Reweighting estimators for Cox regression with missing covariates. J Am Stat Assoc 104(487):1155–1167
Zhou X, Sun L (2003) Additive hazards regression with missing censoring information. Stat Sinica 13(4):1237–1257
Acknowledgements
The authors thank the Editor-in-Chief, the Associate Editor, and two referees for their insightful comments and suggestions that greatly improved the article. Funding was provided by the National Natural Science Foundation of China (Grant Nos. 11201484, 11326184, 11501573, 61402534), the Natural Science Foundation of Shandong Province of China (Grant No. ZR2015AL014), the National Cancer Institute (Grant No. P01CA142538) and the National Institute of Environmental Health Sciences (Grant No. R01ES021900).
Appendix
1.1 Regularity conditions and sketch proofs of the main results
For proofs of the theorems, we list the following regularity conditions.
- (C1): \(\Lambda _{0}(\tau )<\infty \) and \(\mathrm {Pr}\{Y(\tau )=1\}>0\);
- (C2): Z is bounded with probability 1 and time-independent;
- (C3): The matrix A is positive definite;
- (C4): The observation probability \(\pi (W,\alpha )\) is bounded away from 0; \(\pi (W,\alpha )\) is twice continuously differentiable in \(\alpha \); there exists a compact neighborhood \(\mathcal {A}\) of \(\alpha _{0}\) such that \(E[\mathrm {sup}_{\alpha \in \mathcal {A}}\{\Vert \dot{\pi }(W,\alpha )\Vert ^{2} +\Vert \ddot{\pi }(W,\alpha )\Vert \}]<\infty \), where \(\dot{\pi }(W,\alpha )=\partial \pi (W,\alpha )/\partial \alpha \) and \(\ddot{\pi }(W,\alpha )=\partial ^{2} \pi (W,\alpha )/\partial \alpha \partial \alpha ^{T}\); there exists \(\alpha _{*}\) satisfying the equations \(E(S_{\alpha _{*}}^{*})=0\), where \(S_{\alpha }^{*}=(\xi -\pi (W,\alpha ))[\pi (W,\alpha )(1-\pi (W,\alpha ))]^{-1}\dot{\pi }(W,\alpha )\);
- (C5): \(\rho (W,\gamma )\) is twice continuously differentiable in \(\gamma \); there exists \(\gamma _{*}\) satisfying the equations \(E(S_{\gamma _{*}}^{*})=0\), where \(S_{\gamma }^{*}=\xi (\delta -\rho (W,\gamma ))[\rho (W,\gamma )(1-\rho (W,\gamma ))]^{-1}\dot{\rho }(W,\gamma )\).
All these conditions are standard for the derivation of asymptotic results in survival analysis and parametric inference.
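To make the score functions in (C4) and (C5) concrete, the following Python sketch assumes logistic working models for both \(\pi (W,\alpha )\) and \(\rho (W,\gamma )\) with a scalar W plus an intercept. This choice of link, the helper names `expit`, `score_sum` and `solve_score`, and the toy data are assumptions made for illustration and are not taken from the paper. Under a logistic link, \(\dot{\pi }=\pi (1-\pi )(1,W)^{T}\), so \(S_{\alpha }^{*}\) reduces to \((\xi -\pi )(1,W)^{T}\); the score in (C5) has the same form, with \(\delta \) in place of \(\xi \) and the sum restricted to subjects with \(\xi =1\).

```python
import math

def expit(x):
    """Logistic function, used here as the assumed working link."""
    return 1.0 / (1.0 + math.exp(-x))

def score_sum(y, w, use, theta):
    """Summed score sum_i use_i * (y_i - p_i) * (1, w_i) for a logistic model.

    With y = xi and use = 1 for everyone this is the (C4) score;
    with y = delta and use = xi it is the complete-case (C5) score.
    """
    s0 = s1 = 0.0
    for yi, wi, ui in zip(y, w, use):
        if ui:  # subjects with use_i = 0 contribute nothing
            r = yi - expit(theta[0] + theta[1] * wi)
            s0 += r
            s1 += r * wi
    return s0, s1

def solve_score(y, w, use, iters=25):
    """Solve the empirical score equation by Newton-Raphson from theta = 0."""
    theta = [0.0, 0.0]
    for _ in range(iters):
        s0, s1 = score_sum(y, w, use, theta)
        i00 = i01 = i11 = 0.0  # entries of the 2x2 information matrix
        for wi, ui in zip(w, use):
            if ui:
                p = expit(theta[0] + theta[1] * wi)
                v = p * (1.0 - p)
                i00 += v
                i01 += v * wi
                i11 += v * wi * wi
        det = i00 * i11 - i01 * i01
        theta[0] += (i11 * s0 - i01 * s1) / det
        theta[1] += (i00 * s1 - i01 * s0) / det
    return theta

# Toy data (hypothetical): delta is None whenever xi = 0, i.e. indicator missing.
w = [0, 0, 0, 0, 1, 1, 1, 1]
xi = [1, 0, 1, 0, 1, 1, 1, 0]
delta = [1, None, 0, None, 1, 1, 0, None]

alpha_hat = solve_score(xi, w, [1] * len(w))  # model for Pr(xi = 1 | W)
gamma_hat = solve_score(delta, w, xi)         # model for Pr(delta = 1 | W), complete cases
print(alpha_hat, gamma_hat)
```

Using one solver for both models reflects the shared structure of the two scores; only the response and the inclusion indicator change.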
Proof of Theorem 1
By some simple algebraic calculations, it can be seen that
Under conditions (C1), (C2) and (C4), it can be shown that
By (A.1), (A.3) and the fact that \(\hat{\alpha }\xrightarrow {\mathrm {P}}\alpha _{0}\), we have
It is easy to see that
So we can conclude that
By the Taylor expansion of \(n^{\frac{1}{2}}U_{sr}(\beta _{0},\hat{\alpha })\) at \(\alpha _{0}\),
where
By (A.1) to (A.3) and the law of large numbers, it can be proven that
where
and
By (A.1), (A.3) and Lemma A.1 of Qi et al. (2005), we can obtain that
Define
and
which are score and information matrices of \(\pi (W,\alpha )\) respectively. Then under condition (C4), it can be shown that
where \(S_{\alpha ,i}\) is obtained through replacing \(\xi \) and W by \(\xi _{i}\) and \(W_{i}\) in \(S_{\alpha }\) respectively. By (A.4) to (A.8), we finally arrive at
By the central limit theorem, the desired result is proved. \(\square \)
Proof of Theorem 2
It is easily verified that
where
By the fact that
and (A.1), we can conclude that
By (A.11), we have
By the Taylor expansion of \(n^{\frac{1}{2}}U_{ar}(\beta _{0},\hat{\alpha },\hat{\gamma })\) at \(\alpha _{*}\) and \(\gamma _{*}\),
where
and
By (A.1), \(\hat{\alpha } \xrightarrow {\mathrm {P}}\alpha _{*}\) and \(\hat{\gamma } \xrightarrow {\mathrm {P}}\gamma _{*}\), we have
and
where
and
Similar to (A.7), we have
where
Define
and
Then, under conditions (C4) and (C5), we have
and
where \(S_{\alpha _{*},i}^{*}\) and \(S_{\gamma _{*},i}^{*}\) are obtained through replacing \(\xi , \delta \) and W by \(\xi _{i}, \delta _{i}\) and \(W_{i}\) in \(S_{\alpha _{*}}^{*}\) and \(S_{\gamma _{*}}^{*}\). By (A.9) to (A.18), we can finally conclude that
By the central limit theorem, the desired result is proved. \(\square \)
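Theorem 2 is stated at the limits \(\alpha _{*}\) and \(\gamma _{*}\) rather than at true parameter values, which suggests that the two working models need not both be correctly specified. As a rough illustration of why augmentation can tolerate one misspecified model, the Python sketch below uses the standard augmented inverse probability weighted pseudo-indicator \(\xi \delta /\pi +(1-\xi /\pi )\rho \) in the style of Robins et al. (1994). This form, the function names and the numerical values are assumptions for illustration only; the paper's estimating function \(U_{ar}\) is not reproduced here. The expectation is computed exactly for a fixed W, assuming \(\xi \) and \(\delta \) are conditionally independent given W (missing at random).

```python
def aipw_indicator(xi, delta, pi_work, rho_work):
    """Augmented pseudo-indicator xi*delta/pi + (1 - xi/pi)*rho (assumed form).

    When xi = 0 the unobserved delta is multiplied by zero, so its value is irrelevant.
    """
    return xi * delta / pi_work + (1.0 - xi / pi_work) * rho_work

def expected_aipw(pi_true, rho_true, pi_work, rho_work):
    """Exact conditional mean of the pseudo-indicator at a fixed W.

    pi_true = Pr(xi = 1 | W) and rho_true = Pr(delta = 1 | W) are the true
    values; pi_work and rho_work are the (possibly wrong) working values.
    """
    total = 0.0
    for xi in (0, 1):
        for delta in (0, 1):
            p_xi = pi_true if xi else 1.0 - pi_true
            p_delta = rho_true if delta else 1.0 - rho_true
            total += p_xi * p_delta * aipw_indicator(xi, delta, pi_work, rho_work)
    return total

# Hypothetical true values at some fixed W.
pi_true, rho_true = 0.8, 0.7

print(expected_aipw(pi_true, rho_true, 0.8, 0.2))  # pi correct, rho wrong
print(expected_aipw(pi_true, rho_true, 0.5, 0.7))  # pi wrong, rho correct
print(expected_aipw(pi_true, rho_true, 0.5, 0.2))  # both wrong
```

Algebraically the conditional mean equals \(\rho +(\pi _{0}/\pi )(\rho _{0}-\rho )\), where \(\pi _{0}\) and \(\rho _{0}\) denote the true values, so it recovers \(\rho _{0}\) whenever either working value is correct and is generally biased when both are wrong.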
Proof of (7)
It is easy to see that
By (A.1) and Taylor expansion of \(I\) at \(\alpha _{0}\), we obtain
where \(f_{4}(t,z,\alpha _{0})\) is the limit of \(\hat{f}_{4}(t,z,\hat{\alpha })\). \(\square \)
Similar to the proof of Theorem 2.4 in Lin (2011), by (4), it can be proven that
By (A.21) and Taylor expansion, we have
where \(f_{1}(t,z,\alpha _{0})\) and \(f_{3}(t,z,\alpha _{0})\) are the limits of \(\hat{f}_{1}(t,z,\hat{\alpha })\) and \(\hat{f}_{3}(t,z,\hat{\alpha })\) respectively. It is easy to see that
From (A.19) to (A.23), we finally arrive at
where \(f_{2}(t,z,\alpha _{0})\) is the limit of \(\hat{f}_{2}(t,z,\hat{\alpha })\). The finite-dimensional convergence of \(\mathcal {F}(t,z)\) can be proven by the multivariate central limit theorem. By the techniques in Lin (2011), it can be proven that \(\mathcal {F}(t,z)\) is tight. So \(\mathcal {F}(t,z)\) converges weakly to a zero-mean Gaussian process, which can be approximated by (7).
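Approximations of this kind are commonly evaluated by simulation. The Python sketch below assumes that (7) is a multiplier-type approximation in the spirit of Lin et al. (1993), in which each subject's contribution is multiplied by an independent standard normal variable; that reading of (7), the function name `multiplier_sup_pvalue`, and the array `psi` of per-subject contributions on a grid of (t, z) points are all assumptions for illustration, not the paper's construction.

```python
import random

def multiplier_sup_pvalue(psi, n_resample=500, seed=0):
    """Supremum test via Gaussian multipliers (illustrative sketch).

    psi[i][j] is a hypothetical contribution of subject i at grid point j.
    The observed process is n^{-1/2} * sum_i psi[i][j]; each resample replaces
    psi[i][j] by G_i * psi[i][j] with G_i independent standard normal.
    Returns the observed supremum and the proportion of resampled suprema
    that are at least as large.
    """
    rng = random.Random(seed)
    n = len(psi)
    k = len(psi[0])
    scale = n ** -0.5
    observed = max(abs(scale * sum(psi[i][j] for i in range(n))) for j in range(k))
    exceed = 0
    for _ in range(n_resample):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sup = max(abs(scale * sum(g[i] * psi[i][j] for i in range(n))) for j in range(k))
        if sup >= observed:
            exceed += 1
    return observed, exceed / n_resample

# Hypothetical contributions: 50 subjects, 2 grid points, all pushing the same way,
# so the observed process is far from zero relative to its resampled versions.
psi_example = [[1.0, 2.0] for _ in range(50)]
print(multiplier_sup_pvalue(psi_example))
```

A small proportion indicates that the observed process is unusually large relative to its simulated null distribution, which is how such a check would flag an inadequate model.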
Cite this article
Chen, X., Cai, J. Reweighted estimators for additive hazard model with censoring indicators missing at random. Lifetime Data Anal 24, 224–249 (2018). https://doi.org/10.1007/s10985-017-9398-z
DOI: https://doi.org/10.1007/s10985-017-9398-z
Keywords
- Additive hazard model
- Censored data
- Inverse probability weighted estimator
- Missing censoring indicators
- Reweighting