Abstract
Interval-censored failure time data and panel count data are two types of incomplete data that commonly occur in event history studies, and many methods have been developed for their separate analysis (Sun in The statistical analysis of interval-censored failure time data. Springer, New York, 2006; Sun and Zhao in The statistical analysis of panel count data. Springer, New York, 2013). Sometimes one may be interested in, or need to conduct, their joint analysis, for example in clinical trials with composite endpoints, for which there does not seem to exist an established approach in the literature. In this paper, a sieve maximum likelihood approach is developed for the joint analysis, in which Bernstein polynomials are used to approximate the unknown functions. The asymptotic properties of the resulting estimators are established and, in particular, the proposed estimators of the regression parameters are shown to be semiparametrically efficient. In addition, an extensive simulation study is conducted and the proposed method is applied to a set of real data arising from a skin cancer study.
References
Banerjee M, Sen B (2007) A pseudolikelihood method for analyzing interval censored data. Biometrika 94:71–86
Burnham KP, Anderson DR (2003) Model selection and multimodel inference: a practical information-theoretic approach. Springer, Berlin
Carnicer JM, Peña JM (1993) Shape preserving representations and optimality of the Bernstein basis. Adv Comput Math 1:173–196
Chen D, Sun J, Peace K (2012) Interval-censored time-to-event data: methods and applications. Chapman & Hall/CRC
Ding Y, Nan B (2011) A sieve M-theorem for bundled parameters in semiparametric models, with application to the efficient estimation in a linear model for censored data. Ann Stat 39:3032–3061
Finkelstein DM (1986) A proportional hazards model for interval-censored failure time data. Biometrics 42:845–854
Groeneboom P, Wellner JA (1992) Information bounds and nonparametric maximum likelihood estimation. Springer, Berlin
Hua L, Zhang Y, Tu W (2014) A spline-based semiparametric sieve likelihood method for over-dispersed panel count data. Can J Stat 42(2):217–245
Huang J, Rossini AJ (1997) Sieve estimation for the proportional-odds failure-time regression model with interval censoring. J Am Stat Assoc 92(439):960–967
Lorentz GG (1986) Bernstein polynomials. Chelsea Publishing Co, New York
Lu M, Zhang Y, Huang J (2009) Semiparametric estimation methods for panel count data using monotone B-splines. J Am Stat Assoc 104(487):1060–1070
Ma L, Hu T, Sun J (2015) Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 102(3):731–738
Osman M, Ghosh SK (2012) Nonparametric regression models for right-censored data using Bernstein polynomials. Comput Stat Data Anal 56(3):559–573
Pollard D (1984) Convergence of stochastic processes. Springer, New York
Shen X (1997) On methods of sieves and penalization. Ann Stat 25(6):2555–2591
Shen X, Wong WH (1994) Convergence rate of sieve estimates. Ann Stat 22:580–615
Sun J (2006) The statistical analysis of interval-censored failure time data. Springer, New York
Sun J, Wei LJ (2000) Regression analysis of panel count data with covariate-dependent observation and censoring times. J R Stat Soc Ser B (Stat Methodol) 62(2):293–302
Sun J, Zhao X (2013) The statistical analysis of panel count data. Springer, New York
Turnbull BW (1976) The empirical distribution function with arbitrarily grouped censored and truncated data. J R Stat Soc Ser B 38:290–295
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes: with application to statistics. Springer
Wellner JA, Zhang Y (2000) Two estimators of the mean of a counting process with panel count data. Ann Stat 28:779–814
Zhang Z, Sun J, Sun L (2005) Statistical analysis of current status data with informative observation times. Stat Med 24(9):1399–1407
Zhang Y, Hua L, Huang J (2010) A spline-based semiparametric maximum likelihood estimation method for the Cox model with interval-censored data. Scand J Stat 37(2):338–354
Acknowledgements
The authors wish to thank the Editor-in-Chief, Dr. Mei-Ling Lee, an Associate Editor and two reviewers for their many helpful comments and suggestions. The work was partly supported by the National Natural Science Foundation of China grants 11471135 and 11571133 and the Central China Normal University grant MOE 15ZD011 to the second author and the NIH grant R21CA198641 to the third author.
Appendix: Proofs of the asymptotic properties of \(\hat{\theta }_n\)
In this Appendix, we sketch the proofs of the results given in Theorems 1–3 by using empirical process theory (van der Vaart and Wellner 1996) and some techniques commonly used in the nonparametric literature.
First let \(X=\{X_1,\ldots , X_n\}\) denote the observed data and define \(\mathcal {L}_n=\{l(\theta , X): \theta \in \varTheta _n\}\).
Proof of Theorem 1
For any \(\theta ^1=({\alpha ^1}', {\beta ^1}', \gamma ^1, \varLambda _1^1, \varLambda _2^1)'\) and \(\theta ^2=({\alpha ^2}', {\beta ^2}', \gamma ^2, \varLambda _1^2, \varLambda _2^2)'\in \varTheta _n\), it is easy to show that
by a Taylor series expansion. According to the result on page 94 of van der Vaart and Wellner (1996), the covering number of \(\mathcal {L}_n\) satisfies
where \(p_m=2p+2m+3\). It then follows from inequality (31) of Pollard (1984, p. 31) that
in probability. Define \(\varTheta _\epsilon =\{\theta : d(\theta , \theta _0)\ge \epsilon , \theta \in \varTheta _n\}\) and let \(J(\theta , X)=-l(\theta , X)\), \(\zeta _{1n}=\mathop {\mathrm{sup}}\limits _{\theta \in \varTheta _n}|P_n J(\theta , X)-P J(\theta , X)| \), and \(\zeta _{2n}= P_n J(\theta _0, X)-P J(\theta _0, X)\). Then
If \({\hat{\theta }}_n\in \varTheta _\epsilon \), we have
By Condition 4, we have \(\mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon } P J(\theta , X)-P J(\theta _0, X) = \delta _\epsilon >0\). Thus,
with \(\zeta _{n}=\zeta _{1n}+\zeta _{2n}\). Hence \(\zeta _{n}\ge \delta _\epsilon \) and, furthermore, \(\{{\hat{\theta }}_n\in \varTheta _\epsilon \}\subseteq \{\zeta _n\ge \delta _\epsilon \}\). By (2) and the strong law of large numbers, we have \(\zeta _{1n}=o(1)\), \(\zeta _{2n}=o(1)\) and hence \(\zeta _{n}=o(1)\) almost surely. Therefore, \(\cap _{k=1}^\infty \cup _{n=k}^\infty \{{\hat{\theta }}_n\in \varTheta _\epsilon \}\subseteq \cap _{k=1}^\infty \cup _{n=k}^\infty \{\zeta _n\ge \delta _\epsilon \}\), and the latter event has probability zero, which shows the strong consistency of \({\hat{\theta }}_n\). \(\square \)
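One way to spell out the chain of inequalities behind the inclusion \(\{{\hat{\theta }}_n\in \varTheta _\epsilon \}\subseteq \{\zeta _n\ge \delta _\epsilon \}\) is sketched below. It is a reconstruction from the definitions of \(J\), \(\zeta _{1n}\), \(\zeta _{2n}\) and \(\delta _\epsilon \) given above, not necessarily the authors' exact displays, and it assumes that \(P_n J({\hat{\theta }}_n, X)\le P_n J(\theta _0, X)\), which is immediate if \(\theta _0\in \varTheta _n\) and otherwise needs a sieve approximation of \(\theta _0\) in its place. Since \(P J(\theta , X)\le P_n J(\theta , X)+\zeta _{1n}\) for every \(\theta \in \varTheta _n\),

\[
\inf _{\varTheta _\epsilon } P J(\theta , X)\le \zeta _{1n}+\inf _{\varTheta _\epsilon } P_n J(\theta , X),
\]

and, on the event \(\{{\hat{\theta }}_n\in \varTheta _\epsilon \}\),

\[
\inf _{\varTheta _\epsilon } P_n J(\theta , X)=P_n J({\hat{\theta }}_n, X)\le P_n J(\theta _0, X)=\zeta _{2n}+P J(\theta _0, X),
\]

so that

\[
\delta _\epsilon =\inf _{\varTheta _\epsilon } P J(\theta , X)-P J(\theta _0, X)\le \zeta _{1n}+\zeta _{2n}=\zeta _n .
\]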
Proof of Theorem 2
To establish the convergence rate, note that by Theorem 1.6.2 of Lorentz (1986) or the proof of Theorem 2 in Osman and Ghosh (2012), if \(m=o(n^{\nu })\), there exist Bernstein polynomials \(\varLambda _{1n0}\) and \(\varLambda _{2n0}\) such that \(\Vert \varLambda _{1n0}-\varLambda _{10}\Vert _\infty =O(n^{-\frac{r\nu }{2}})\) and \(\Vert \varLambda _{2n0}-\varLambda _{20}\Vert _\infty =O(n^{-\frac{r\nu }{2}})\), respectively. For any \(\eta >0\), define the class \(\mathcal {F}_\eta =\{ l(\theta _{n0}, X) - l(\theta , X): \theta \in \varTheta _n, d(\theta , \theta _{n0})<\eta \}\) with \(\theta _{n0}=(\alpha '_0, \beta '_0, \gamma _0, \varLambda _{1n0}, \varLambda _{2n0})'\). Following the calculation of Shen and Wong (1994, p. 597), we can establish that \(\log N_{[]}(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)\le C N \log (\eta /\varepsilon )\) with \(N=2(m+1)\). Moreover, some algebraic calculations lead to \(\Vert l(\theta _{n0}, X)-l(\theta , X)\Vert ^2_2\le C \eta ^2\) for any \(l(\theta _{n0}, X)-l(\theta , X)\in \mathcal {F}_\eta \). Therefore, it follows from Lemma 3.4.2 of van der Vaart and Wellner (1996) that
where \(J_{\eta }(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)=\int _{0}^\eta \{1+\log N_{[]}(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)\}^{1/2} d\varepsilon \le C N^{1/2} \eta \).
Note that the right-hand side of (3) gives \(\phi _n (\eta )= C (N^{1/2}\eta + N/n^{1/2})\). It is easy to see that \(\phi _n(\eta )/\eta \) decreases in \(\eta \) and that \(r_n^2\phi _n(1/r_n)=r_n N^{1/2}+ r_n^2 N/n^{1/2}\le 2 n^{1/2}\), where \(r_n=N^{-1/2}n^{1/2}=n^ {(1-\nu )/2}\) with \(0<\nu < 0.5\). Hence \(n^{(1-\nu )/2} d({\hat{\theta }}_n, \theta _{n0})=O_p(1)\) by Theorem 3.4.1 of van der Vaart and Wellner (1996). This, together with \(d(\theta _{n0}, \theta _0)=O_p(n^{-r\nu /2})\), yields \(d({\hat{\theta }}_n, \theta _0)=O_p(n^{-(1-\nu )/2}+n^{-r\nu /2})\). The choice \(\nu =1/(1+r)\) balances the two terms and yields the rate of convergence \(d({\hat{\theta }}_n, \theta _0)=O_p(n^{-r/(2+2r)})\).
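As a numerical illustration of the two ingredients of this rate (it is not part of the original proof), the following Python sketch evaluates a Bernstein polynomial approximation of a smooth increasing function and checks that the exponent \(\min \{(1-\nu )/2, r\nu /2\}\) is largest at \(\nu =1/(1+r)\). The test function `Lambda0`, the helper names `bernstein_approx` and `rate_exponent`, the interval \([0,1]\) and the grids are all hypothetical choices made for illustration only.

```python
import numpy as np
from math import comb


def bernstein_approx(f, m, t, lo=0.0, hi=1.0):
    """Evaluate the degree-m Bernstein polynomial of f on [lo, hi] at points t."""
    t = np.asarray(t, dtype=float)
    u = (t - lo) / (hi - lo)                      # rescale t to [0, 1]
    k = np.arange(m + 1)
    # Coefficients are the values of f on the equally spaced grid lo + (hi-lo)*k/m.
    coef = np.array([f(lo + (hi - lo) * j / m) for j in k], dtype=float)
    binom = np.array([comb(m, int(j)) for j in k], dtype=float)
    # Bernstein basis: C(m,k) * u^k * (1-u)^(m-k), one row per evaluation point.
    basis = binom * u[:, None] ** k * (1.0 - u[:, None]) ** (m - k)
    return basis @ coef


def rate_exponent(nu, r):
    """Exponent a in O_p(n^{-a}) for the bound n^{-(1-nu)/2} + n^{-r*nu/2}."""
    return min((1.0 - nu) / 2.0, r * nu / 2.0)


def Lambda0(s):
    # Hypothetical smooth, increasing "cumulative" function used only as a test case.
    return s + s ** 2


if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 201)
    for m in (5, 10, 20, 40):
        err = np.max(np.abs(bernstein_approx(Lambda0, m, grid) - Lambda0(grid)))
        print(f"m={m:3d}  sup-norm error={err:.5f}")

    r = 2.0
    nus = np.linspace(0.01, 0.49, 49)
    best = max(nus, key=lambda v: rate_exponent(v, r))
    print("best nu on grid:", best, " theoretical 1/(1+r):", 1.0 / (1.0 + r))
```

For this particular test function the approximation error is \(t(1-t)/m\) in exact arithmetic, so the sup-norm error decays like \(1/m\). Because the coefficients are nondecreasing, the approximant is itself nondecreasing, which is the shape-preserving property that makes Bernstein polynomials convenient for cumulative functions.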
Proof of Theorem 3
As above, let \(\theta _0\) denote the true value of the parameter \(\theta \) and define V to be the linear span of \(\varTheta - \theta _0\). Also let \(l(\theta , X)\) be the log-likelihood for a sample of size one and \(\delta _n = n^{-(1-\nu )/2}+n^{-r\nu /2}\). For any \(\theta \in \{\theta \in \varTheta : \Vert \theta -\theta _0\Vert =O(\delta _n)\}\), define the first order directional derivative of \(l(\theta , X)\) in the direction \(v\in V\) as
and the second order directional derivative as
Also define the Fisher inner product on the space V as
and the Fisher norm for \(v\in V\) as \(\Vert v\Vert = < v, v>^{1/2}\). Let \(\bar{V}\) be the closed linear span of V under the Fisher norm. Then \((\bar{V},\Vert \cdot \Vert )\) is a Hilbert space.
Furthermore, define the smooth functional of \(\theta \) as
where \(b=(b'_1, b'_2, b_3)'\) is any \((2p+1)\)-dimensional vector with \(\Vert b\Vert \le 1\). For any \(v\in V\), we denote
Note that \(\psi (\theta )-\psi (\theta _0)=\dot{\psi }(\theta _0)[\theta -\theta _0]\). It follows from the Riesz representation theorem that there exists \(v^* \in \bar{V}\) such that \(\dot{\psi }(\theta _0)[v]=<v^*, v>\) for all \(v\in \bar{V}\) and \(\Vert v^*\Vert =\Vert \dot{\psi }(\theta _0)\Vert \). Thus it follows from the Cramér-Wold device that to prove Theorem 3, it suffices to show that
since \(b'\{({\hat{\alpha }}_n -\alpha _0)', ({\hat{\beta }}_n -\beta _0)', ({\hat{\gamma }}_n -\gamma _0)\}'=\psi ({\hat{\theta }}_n)-\psi (\theta _0)=\dot{\psi }(\theta _0)[{\hat{\theta }}_n -\theta _0]=<{\hat{\theta }}_n -\theta _0, v^*>\). In fact, (4) can be proved using arguments similar to those in the proof of Theorem 1 of Shen (1997). For each component \(\vartheta _q\) of \(\vartheta \), \(q=1,2,\ldots ,2p+1,\) we denote by \(\zeta ^*_q=(b_{1q}^*,b_{2q}^*)\) the solution to
where \(l_\vartheta =(l_\alpha ',l_\beta ', l_\gamma )'\) and \(e_q\) is a \((2p+1)\)-dimensional vector of zeros except for the q-th element, which equals 1. Here \(l_{b_1}[b_1]\) and \(l_{b_2}[b_2]\) are the directional derivatives with respect to \(\varLambda _1\) and \(\varLambda _2\), calculated in the same way as the directional derivatives defined at the beginning of the proof of Theorem 3. Now let \(\zeta ^*=(\zeta _1^*,\ldots ,\zeta _{2p+1}^*).\) By the calculations of Chen et al. (2012), we have \(\Vert v^*\Vert ^2=\Vert \dot{\psi }(\theta _0)\Vert ^2=\sup _{v\in \bar{V}:\Vert v\Vert >0}\frac{|\dot{\psi }(\theta _0)[v]|^2}{\Vert v\Vert ^2}=b'\Sigma b,\) where \(\Sigma =[E(S_\vartheta S_\vartheta ')]^{-1}\) and \(S_\vartheta =\{l_\vartheta -l_{b_1^*}[b_1^*]-l_{b_2^*}[b_2^*]\}.\) Hence the semiparametric efficiency can be established by applying the result of Theorem 4 in Shen (1997), which completes the proof.
Cite this article
Xu, D., Zhao, H. & Sun, J. Joint analysis of interval-censored failure time data and panel count data. Lifetime Data Anal 24, 94–109 (2018). https://doi.org/10.1007/s10985-017-9397-0
DOI: https://doi.org/10.1007/s10985-017-9397-0