Abstract
The additive hazards model is one of the most popular regression models for analyzing failure time data, especially when one is interested in the excess risk or risk difference. Although several methods have been developed in the literature for regression analysis of interval-censored data, a general type of failure time data, they can be complicated or inefficient. To address this, we present a new maximum likelihood estimation procedure based on the sieve approach and, in particular, develop an EM algorithm that involves a two-stage data augmentation with Poisson latent variables. The method is easy to implement, and the asymptotic properties of the proposed estimators are established. A simulation study is conducted to assess the performance of the proposed method and indicates that it works well in practical situations. The method is also applied to a set of interval-censored data from an AIDS cohort study.
References
Bickel, P. J., Klaassen, C. A. J., Ritov, Y., & Wellner, J. A. (1993). Efficient and adaptive estimation for semiparametric models. Baltimore: Johns Hopkins University Press.
Chen, L., & Sun, J. (2010). A multiple imputation approach to the analysis of interval-censored failure time data with the additive hazards model. Computational Statistics and Data Analysis, 54, 1109–1116.
Chen, D., Sun, J., & Peace, K. E. (2012). Interval-censored time-to-event data: Methods and applications. Boca Raton: Chapman & Hall/CRC.
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B, 34, 187–220.
Ghosh, D. (2003). Goodness-of-fit methods for additive-risk models in tumorigenicity experiments. Biometrics, 59, 721–726.
Goedert, J. J., Kessler, C. M., Aledort, L. M., Biggar, R. J., Andes, W. A., Drummond, J. E., et al. (1989). A prospective study of human immunodeficiency virus type 1 infection and the development of AIDS in subjects with hemophilia. New England Journal of Medicine, 321, 1141–1148.
Groeneboom, P., & Wellner, J. A. (1992). Information bounds and nonparametric maximum likelihood estimation. DMV seminar band (Vol. 19). Basel: Birkhäuser.
Huang, J. (1996). Efficient estimation for the proportional hazards model with interval censoring. Annals of Statistics, 24, 540–568.
Hu, T., Cui, H., & Tong, X. (2009). Efficient estimation for semiparametric varying-coefficient partially linear regression models with current status data. Acta Mathematicae Applicatae Sinica-English Series, 25, 195–204.
Kalbfleisch, J. D., & Prentice, R. L. (2002). The statistical analysis of failure time data (2nd ed.). New York: Wiley.
Li, S., Hu, T., Wang, P., & Sun, J. (2018). A class of semiparametric transformation models for doubly censored failure time data. Scandinavian Journal of Statistics, 45, 682–698.
Lin, D. Y., Oakes, D., & Ying, Z. (1998). Additive hazards regression with current status data. Biometrika, 85, 289–298.
Lin, D. Y., & Ying, Z. (1994). Semiparametric analysis of the additive risk model. Biometrika, 81, 61–71.
Liu, W., Fang, J., & Lu, X. (2018). Additive-multiplicative hazards model with current status data. Computational Statistics, 33, 1245–1266.
Lorentz, G. G. (1986). Bernstein polynomials (2nd ed.). New York: Chelsea Publishing Co.
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
Lu, X., & Song, X. K. (2012). On efficient estimation in additive hazards regression with current status data. Computational Statistics and Data Analysis, 56, 2051–2058.
Lu, X., & Song, X. K. (2015). Efficient estimation of the partly linear additive hazards model with current status data. Scandinavian Journal of Statistics, 42, 306–328.
Ma, L., Hu, T., & Sun, J. (2015). Sieve maximum likelihood regression analysis of dependent current status data. Biometrika, 102, 731–738.
Martinussen, T., & Scheike, T. H. (2002). Efficient estimation in additive hazards regression with current status data. Biometrika, 89, 649–658.
Murphy, S. A., & van der Vaart, A. W. (2000). On profile likelihood. Journal of the American Statistical Association, 95, 449–465.
Shen, X., & Wong, W. H. (1994). Convergence rate of sieve estimates. Annals of Statistics, 22, 580–615.
Sun, J. (2006). The statistical analysis of interval-censored failure time data. New York: Springer.
Turnbull, B. W. (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society, Series B, 38, 290–295.
van der Vaart, A., & Wellner, J. (1996). Weak convergence and empirical processes with applications to statistics. New York: Springer.
Wang, L., McMahan, C. S., Hudgens, M. G., & Qureshi, Z. P. (2016). A flexible, computationally efficient method for fitting the proportional hazards model to interval-censored data. Biometrics, 72, 222–231.
Wang, L., Sun, J., & Tong, X. (2010). Regression analysis of case II interval-censored failure time data with the additive hazards model. Statistica Sinica, 20, 1709–1723.
Zeng, D., Cai, J., & Shen, Y. (2006). Semiparametric additive risks model for interval-censored data. Statistica Sinica, 16, 287–302.
Zeng, D., Lu, M., & Lin, D. Y. (2016). Maximum likelihood estimation for semiparametric transformation models with interval-censored data. Biometrika, 103, 253–271.
Zhang, Y., Hua, L., & Huang, J. (2010). A spline-based semiparametric maximum likelihood estimation method for the Cox model with interval-censored data. Scandinavian Journal of Statistics, 37, 338–354.
Zhou, Q., Hu, T., & Sun, J. (2017). A sieve semiparametric maximum likelihood approach for regression analysis of bivariate interval-censored failure time data. Journal of the American Statistical Association, 112, 664–672.
Acknowledgements
The authors wish to thank the Associate Editor and the referees for their many helpful comments and suggestions that greatly improved the paper. Wang’s work was supported by the National Natural Science Foundation of China (11801212). Zhou’s work was supported by the State Key Program in the Major Research Plan of the National Natural Science Foundation of China (91546202) and the State Key Program of the National Natural Science Foundation of China (71931004). Sun’s work was supported by the National Natural Science Foundation of China (11671168).
Appendix: Proofs of the Theorems
Proof of Theorem 1
Let \(l(\theta ; O)\) denote the log-likelihood function based on a single observation O and consider the class of functions \({\mathcal {L}}=\{l(\theta ; O):\theta \in \Theta _n\}\). Then, following a calculation similar to that of Zhang et al. (2010), the bracketing number of \({\mathcal {L}}\) is bounded by \(K(1/\varepsilon )^{K(m+1)+d}\). Hence \({\mathcal {L}}\) is Glivenko–Cantelli by Theorem 2.4.1 of van der Vaart and Wellner (1996). Thus
\[\sup _{\theta \in \Theta _n}|P_nl(\theta ;O)-Pl(\theta ;O)|\rightarrow 0 \quad \text{almost surely}.\]
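Let \(M(\theta ;O)=-l(\theta ;O)\), and define \(K_\epsilon =\{\theta :d(\theta ,\theta _0)\ge \epsilon ,\theta \in \Theta _n\}\) for \(\epsilon >0\), \(\zeta _{1n}=\sup _{\theta \in \Theta _n}|P_nM(\theta ;O)-PM(\theta ;O)|\), and \(\zeta _{2n}=P_nM(\theta _0;O)-PM(\theta _0;O)\). Then
\[\inf _{K_\epsilon }PM(\theta ;O)=\inf _{K_\epsilon }\{PM(\theta ;O)-P_nM(\theta ;O)+P_nM(\theta ;O)\}\le \zeta _{1n}+\inf _{K_\epsilon }P_nM(\theta ;O). \qquad (3)\]
If \({\widehat{\theta }}_n\in K_{\epsilon }\), then
\[\inf _{K_\epsilon }P_nM(\theta ;O)=P_nM({\widehat{\theta }}_n;O)\le P_nM(\theta _0;O)=\zeta _{2n}+PM(\theta _0;O). \qquad (4)\]
It follows from (3) and (4) that
\[\inf _{K_\epsilon }PM(\theta ;O)\le \zeta _{1n}+\zeta _{2n}+PM(\theta _0;O),\]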
and hence \(\zeta _{1n}+\zeta _{2n}\ge \delta _{\epsilon }>0\), where \(\delta _{\epsilon }=\inf _{K_{\epsilon }}PM(\theta ;O)-PM(\theta _0;O)\). This gives \(\{{\widehat{\theta }}_n\in K_{\epsilon }\}\subseteq \{\zeta _{1n}+\zeta _{2n}\ge \delta _{\epsilon }\}\). By the Glivenko–Cantelli property of \({\mathcal {L}}\) and the strong law of large numbers, \(\zeta _{1n}+\zeta _{2n}\rightarrow 0\) almost surely. Therefore, \(d({\widehat{\theta }}_n,\theta _0)\rightarrow 0\) almost surely. \(\square \)
Proof of Theorem 2
We verify the conditions of Theorem 3.4.1 of van der Vaart and Wellner (1996) in order to derive the convergence rate. First note from Theorem 1.6.2 of Lorentz (1986) that there exists a Bernstein polynomial \(\Lambda _{n0}\) such that \(||\Lambda _{n0}-\Lambda _0||_{\infty }=O(m^{-r/2})\). Define \(\theta _{n0}=(\beta _0,\Lambda _{n0})\). Then we have \(d(\theta _{n0},\theta _0)=O(n^{-r\nu /2})\).
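As a numerical illustration of the Bernstein approximation step, the following minimal sketch evaluates the sup-norm error of the classical Bernstein polynomial of degree m for a smooth increasing function on [0, 1]. The function `cum_hazard`, the interval [0, 1], and all helper names are assumptions made for illustration only; they are not taken from the paper. For this twice-differentiable example (r = 2) the error decays like 1/m, in line with the order \(m^{-r/2}\).

```python
from math import comb


def bernstein_approx(f, m, t):
    """Evaluate the degree-m Bernstein polynomial of f on [0, 1] at the point t."""
    return sum(
        f(k / m) * comb(m, k) * t**k * (1 - t) ** (m - k) for k in range(m + 1)
    )


def sup_error(f, m, grid_size=200):
    """Approximate the sup-norm error of the Bernstein polynomial on a uniform grid."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(bernstein_approx(f, m, t) - f(t)) for t in grid)


def cum_hazard(t):
    """Hypothetical smooth, increasing cumulative hazard used only for illustration."""
    return t + 0.5 * t**2


# Sup-norm errors for increasing polynomial degree m.
errors = {m: sup_error(cum_hazard, m) for m in (4, 16, 64)}
for m, err in errors.items():
    print(f"m = {m:3d}, sup error = {err:.6f}, m * error = {m * err:.4f}")
```

For this particular function the product of m and the error stays constant, which is the 1/m decay described above.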
For any \(\eta >0\), define the class of functions \({\mathcal {F}}_{\eta }=\{l(\theta ;O)-l(\theta _{n0};O):\theta \in \Theta _n, \eta /2<d(\theta ,\theta _{n0})\le \eta \}\). One can easily show that \(P(l(\theta _0;O)-l(\theta _{n0};O))\le K d^2(\theta _0,\theta _{n0})\le Kn^{-r\nu }\). Hence, we have for large n,
for any \(l(\theta ;O)-l(\theta _{n0};O)\in {\mathcal {F}}_{\eta }\).
Following the calculations in Shen and Wong (1994), we can derive that for \(0<\varepsilon <\eta \), \(\log N_{[]}(\varepsilon ,{\mathcal {F}}_{\eta },L_2(P))\le K(m+1)\log (\eta /\varepsilon )\). Under Conditions (3), (4) and (6), \({\mathcal {F}}_{\eta }\) is uniformly bounded. Together with \(P(l(\theta ;O)-l(\theta _{n0};O))^2\le K\eta ^2\) for \(l(\theta ;O)-l(\theta _{n0};O)\in {\mathcal {F}}_{\eta }\), by Lemma 3.4.2 of van der Vaart and Wellner (1996), one can obtain
\[E_P\Vert n^{1/2}(P_n-P)\Vert _{{\mathcal {F}}_{\eta }}\le KJ_{[]}(\eta ,{\mathcal {F}}_{\eta },L_2(P))\left\{ 1+\frac{J_{[]}(\eta ,{\mathcal {F}}_{\eta },L_2(P))}{\eta ^2n^{1/2}}\right\} ,\]
where \(J_{[]}(\eta ,{\mathcal {F}}_{\eta },L_2(P))= \int _0^\eta (1+\log N_{[]}(\varepsilon ,{\mathcal {F}}_{\eta },L_2(P)))^{1/2}d\varepsilon \le K(m+1)^{1/2}\eta \). This yields \(\phi _n(\eta )=(m+1)^{1/2}\eta +(m+1)/n^{1/2}\). It is easy to see that if we choose \(r_n=n^{(1-\nu )/2}\), then \(r_n^2\phi _n(1/r_n)=r_n(m+1)^{1/2}+r_n^2(m+1)/n^{1/2}\le K n^{1/2}\).
Together with the facts that \(P_n(l({\widehat{\theta }}_n;O)-l(\theta _{n0};O))\ge 0\) and \(d({\widehat{\theta }}_n,\theta _{n0})\le d({\widehat{\theta }}_n,\theta _{0})+d(\theta _0,\theta _{n0})\rightarrow 0\) in probability, by Theorem 3.4.1 of van der Vaart and Wellner (1996), one can get \(d({\widehat{\theta }}_n,\theta _{n0})=O_P(n^{-(1-\nu )/2})\). This together with \(d(\theta _{n0},\theta _0)=O(n^{-r\nu /2})\) yields that \(d({\widehat{\theta }}_n,\theta _{0})=O_P(n^{-(1-\nu )/2}+n^{-r\nu /2})\). \(\square \)
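The rate in Theorem 2 is the slower of the two terms \(n^{-(1-\nu )/2}\) and \(n^{-r\nu /2}\), so it is natural to ask which \(\nu \) balances them. The sketch below computes the rate exponent \(\min \{(1-\nu )/2, r\nu /2\}\) and the balancing value \(\nu =1/(1+r)\). The function names are hypothetical, and the calculation ignores any further restriction on \(\nu \) imposed by the regularity conditions of the paper.

```python
def rate_exponent(nu, r):
    """Exponent a such that the bound in Theorem 2 reads O_P(n^{-a})."""
    return min((1 - nu) / 2, r * nu / 2)


def best_nu(r):
    """Value of nu equating (1 - nu) / 2 and r * nu / 2."""
    return 1 / (1 + r)


# Balanced exponents for a few smoothness levels r (illustrative values only).
for r in (1, 2, 3):
    nu = best_nu(r)
    print(f"r = {r}: nu = {nu:.4f}, rate exponent = {rate_exponent(nu, r):.4f}")
```

With r = 2 the balance is reached at \(\nu =1/3\), giving the familiar \(n^{-1/3}\) rate for the combined estimator.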
Proof of Theorem 3
We first calculate the information matrix based on the general semiparametric information theory described by Bickel et al. (1993). Define functions \(A_i, i=1,2,3\), by
and
By conditions (3) and (4), \(A_1\), \(A_2\) and \(A_3\) are positive functions of (u, v, z). Then the score function for \(\beta \) is
The score operator for \(\Lambda \) is
Let F be the distribution corresponding to \(\Lambda \) and P be the joint probability measure of \((\delta _1, \delta _2 , \delta _3, U , V , Z)\). Then the score operator \({\dot{l}}_{\Lambda }\) maps \(L_2^0(F)\) to \(L_2^0(P)\), where \(L_2^0(F)\equiv \{a:\int a dF=0 \text{ and } \int a^2 dF<\infty \}\), and \(L_2^0(P)\) is defined similarly. Let \({\dot{l}}_{\Lambda }^T\) : \(L_2^0(P)\rightarrow L_2^0(F)\) be the adjoint operator of \({\dot{l}}_{\Lambda }\), i.e., for any \(a \in L_2^0(F)\) and \(b \in L_2^0(P)\),
\[\langle b,{\dot{l}}_{\Lambda }a\rangle _P=\langle {\dot{l}}_{\Lambda }^Tb,a\rangle _F,\]
where \(\langle \cdot ,\cdot \rangle _P\) and \(\langle \cdot ,\cdot \rangle _F\) are the inner products in \(L_2^0(P)\) and \(L_2^0(F)\), respectively. We need to find \(\phi ^*\) such that \({\dot{l}}_\beta -{\dot{l}}_{\Lambda }\phi ^*\) is orthogonal to \({\dot{l}}_{\Lambda }\phi \) in \(L_2^0(P)\) for every \(\phi \in L_2^0(F)\). This amounts to solving the following normal equation:
\[{\dot{l}}_{\Lambda }^T{\dot{l}}_{\Lambda }\phi ^*={\dot{l}}_{\Lambda }^T{\dot{l}}_{\beta }. \qquad (5)\]
First note that we have
by Groeneboom and Wellner (1992), pages 8–9, or Bickel et al. (1993), pages 271–272.
Let \(B_1(u,v) = E_Z[A_1(u,v,Z)g(u,v|Z)]\), \(B_2(u,v)= E_Z[A_2(u,v,Z)g(u,v|Z)]\), \(B_3(u,v) = E_Z[A_3(u,v,Z)g(u,v|Z)]\) and \(B_4(u,v) = E_Z[g(u,v|Z)]\). By the definitions of \(A_1\), \(A_2\) and \(A_3\), the functions \(B_1,\ldots ,B_4\) are all positive, and \(B_2(u,v) = B_3(u,v) + B_4(u,v)\). We calculate
Let \(C_1(u,v)=E_Z[ZA_1(u,v,Z)g(u,v|Z)]\), \(C_2(u,v)=E_Z[ZA_2(u,v,Z)g(u,v|Z)]\), \(C_3(u,v)=E_Z[ZA_3(u,v,Z)g(u,v|Z)]\) and \(C_4(u,v)=E_Z[Zg(u,v|Z)]\). Then further calculation yields
After some straightforward calculations, the derivative of L(t) is
where \(b(t) = \int _{t+\eta }^{\tau _1} [B_1(t,x)+B_2(t,x)] dx + \int _{\tau _0}^{t-\eta } [B_3(x,t)+B_4(x,t)] dx\). Similarly, the derivative of R(t) is
where \(c(t) = \int _{t+\eta }^{\tau _1} [C_1(t,x)+C_2(t,x)] dx + \int _{\tau _0}^{t-\eta } [C_3(x,t)+C_4(x,t)] dx\).
By conditions (3)–(7), r has a bounded derivative \(r'\) on \([\tau _0,\tau _1]\). So equation (5) reduces to
By conditions (3) and (4), we have \(\inf _{\tau _0\le t\le \tau _1}b(t)>0\). Let \(d(t)=-r(t)/b(t)\) and
Then \(\phi ^*(t)\) is the solution of a Fredholm integral equation of the second kind,
Therefore, the efficient score is \(l_\beta ^*={\dot{l}}_\beta -{\dot{l}}_{\Lambda }\phi ^*\) and the information is \(I(\beta )=E[l_\beta ^*]^{\otimes 2}\).
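A Fredholm integral equation of the second kind rarely has a closed-form solution, so in practice \(\phi ^*\) would be obtained numerically. The following minimal sketch solves a generic equation of the form \(\phi (t)=d(t)+\int _a^b K(t,x)\phi (x)dx\) by trapezoidal quadrature and fixed-point iteration, which converges when the kernel is sufficiently small. The form of the equation, the kernel \(K(t,x)=tx/2\), the right-hand side \(d(t)=t\), and all function names are assumptions chosen for illustration so that the exact solution \(\phi (t)=1.2t\) is known; they are not the kernel or the function d(t) of the proof.

```python
def solve_fredholm(kernel, d, a, b, n=101, tol=1e-12, max_iter=1000):
    """Solve phi(t) = d(t) + int_a^b kernel(t, x) phi(x) dx on a uniform grid.

    Uses the trapezoidal rule for the integral and fixed-point iteration,
    which converges only when the integral operator is a contraction.
    """
    h = (b - a) / (n - 1)
    nodes = [a + i * h for i in range(n)]
    weights = [h] * n
    weights[0] = weights[-1] = h / 2  # trapezoidal end-point weights

    phi = [d(t) for t in nodes]  # start the iteration from the right-hand side
    for _ in range(max_iter):
        new_phi = [
            d(t) + sum(w * kernel(t, x) * p for w, x, p in zip(weights, nodes, phi))
            for t in nodes
        ]
        change = max(abs(u - v) for u, v in zip(new_phi, phi))
        phi = new_phi
        if change < tol:
            break
    return nodes, phi


# Illustrative problem with a known solution phi(t) = 1.2 * t.
nodes, phi = solve_fredholm(lambda t, x: 0.5 * t * x, lambda t: t, 0.0, 1.0)
max_error = max(abs(p - 1.2 * t) for t, p in zip(nodes, phi))
print(f"maximum absolute error on the grid: {max_error:.2e}")
```

When the kernel is not contractive, the same quadrature nodes can instead be used to form a linear system and solve it directly.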
In the following, we will establish the asymptotic normality for \({\widehat{\beta }}_n\). Since \({\widehat{\theta }}_{n}\) maximizes the likelihood function, we have \(P_n {\dot{l}}_{\beta } ({\widehat{\theta }}_n)=0\). Since \(\phi ^*\) is obtained from the Fredholm integral equation given above, it has a bounded derivative and it is a function with bounded variation. Then there exists a \(\phi ^*_n\in {\mathcal {M}}\) such that \(P_n {\dot{l}}_{\Lambda }\phi ^*_n ({\widehat{\theta }}_n)=0\) and \(||\phi ^*_n-\phi ^*||=O(n^{-\nu })\). Then
The first term satisfies \(I=o_p(n^{-1/2})\) by the uniform asymptotic equicontinuity of empirical processes indexed by a Donsker class of functions. By Theorem 2 and the Cauchy–Schwarz inequality, the second term II is \(o_{p}(n^{-1/2})\). Hence \(P_n {\dot{l}}_{\Lambda }\phi ^* ({\widehat{\theta }}_n)=o_{p}(n^{-1/2})\).
By the calculation above, the Fisher information matrix for \(\beta \) is positive definite, and the rate of convergence is given by Theorem 2. Two further facts are needed. The first is the uniform asymptotic equicontinuity of \((P_n-P) {\dot{l}}_{\beta } (\theta )\) and \((P_n-P) {\dot{l}}_{\Lambda }\phi ^* (\theta )\) in a small neighborhood of \(\theta _0\), which follows from the uniform asymptotic equicontinuity of empirical processes indexed by a Donsker class of functions. The second is the smoothness of \(P {\dot{l}}_{\beta } (\theta )\) and \(P {\dot{l}}_{\Lambda }\phi ^* (\theta )\) in a small neighborhood of \(\theta _0\), which follows from a Taylor expansion. Thus all conditions of Theorem 6.1 of Huang (1996) are satisfied and the proof is complete. \(\square \)
Cite this article
Wang, P., Zhou, Y. & Sun, J. A new method for regression analysis of interval-censored data with the additive hazards model. J. Korean Stat. Soc. 49, 1131–1147 (2020). https://doi.org/10.1007/s42952-020-00051-y
DOI: https://doi.org/10.1007/s42952-020-00051-y