Joint analysis of interval-censored failure time data and panel count data

Lifetime Data Analysis

Abstract

Interval-censored failure time data and panel count data are two types of incomplete data that commonly occur in event history studies, and many methods have been developed for their separate analysis (Sun in The statistical analysis of interval-censored failure time data. Springer, New York, 2006; Sun and Zhao in The statistical analysis of panel count data. Springer, New York, 2013). Sometimes one may be interested in, or need to conduct, their joint analysis, such as in clinical trials with composite endpoints, for which there does not seem to exist an established approach in the literature. In this paper, a sieve maximum likelihood approach is developed for the joint analysis, in which Bernstein polynomials are used to approximate the unknown functions. The asymptotic properties of the resulting estimators are established and, in particular, the proposed estimators of the regression parameters are shown to be semiparametrically efficient. In addition, an extensive simulation study is conducted and the proposed method is applied to a set of real data arising from a skin cancer study.
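As a rough illustration of how Bernstein polynomials can approximate an unknown nondecreasing function on a bounded interval, the following sketch builds the classical Bernstein polynomial of a known function and measures the approximation gap. It is not the proposed estimator: in the sieve approach the coefficients are chosen by maximizing the likelihood, whereas here they are simply function values on an equally spaced grid. The function `cumulative`, the interval \([0, 2]\) and all helper names are assumptions made for the example.

```python
from math import comb


def bernstein_basis(k, m, t, lo, hi):
    """k-th Bernstein basis polynomial of degree m on [lo, hi], evaluated at t."""
    u = (t - lo) / (hi - lo)
    return comb(m, k) * u**k * (1.0 - u) ** (m - k)


def bernstein_approximation(f, m, lo, hi):
    """Return the coefficients and the degree-m Bernstein polynomial built from f."""
    # Classical choice: coefficient k is f evaluated at the k-th grid point.
    coefs = [f(lo + (hi - lo) * k / m) for k in range(m + 1)]

    def poly(t):
        return sum(c * bernstein_basis(k, m, t, lo, hi) for k, c in enumerate(coefs))

    return coefs, poly


def sup_error(f, m, lo, hi, n_grid=200):
    """Largest absolute gap between f and its Bernstein polynomial on a grid."""
    _, poly = bernstein_approximation(f, m, lo, hi)
    grid = [lo + (hi - lo) * i / n_grid for i in range(n_grid + 1)]
    return max(abs(f(t) - poly(t)) for t in grid)


def cumulative(t):
    """Hypothetical nondecreasing cumulative function, used only for illustration."""
    return t**2


if __name__ == "__main__":
    for m in (4, 16, 64):
        coefs, _ = bernstein_approximation(cumulative, m, 0.0, 2.0)
        # Nondecreasing coefficients imply a nondecreasing polynomial.
        monotone = all(a <= b for a, b in zip(coefs, coefs[1:]))
        print(m, monotone, sup_error(cumulative, m, 0.0, 2.0))
```

Nondecreasing coefficients give a nondecreasing polynomial, which is one reason the Bernstein basis is convenient for cumulative functions. For this particular quadratic the gap shrinks in proportion to \(1/m\) as the degree \(m\) grows.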

References

  • Banerjee M, Sen B (2007) A pseudolikelihood method for analyzing interval censored data. Biometrika 94:71–86

  • Burnham KP, Anderson DR (2003) Model selection and multimodel inference: a practical information-theoretic approach. Springer, Berlin

  • Carnicer JM, Peña JM (1993) Shape preserving representations and optimality of the Bernstein basis. Adv Comput Math 1:173–196

  • Chen D, Sun J, Peace K (2012) Interval-censored time-to-event data: methods and applications. Chapman & Hall/CRC

  • Ding Y, Nan B (2011) A sieve M-theorem for bundled parameters in semiparametric models, with application to the efficient estimation in a linear model for censored data. Ann Stat 39:3032–3061

  • Finkelstein DM (1986) A proportional hazards model for interval-censored failure time data. Biometrics 42:845–854

  • Groeneboom P, Wellner JA (1992) Information bounds and nonparametric maximum likelihood estimation. Springer, Berlin

  • Hua L, Zhang Y, Tu W (2014) A spline-based semiparametric sieve likelihood method for over-dispersed panel count data. Can J Stat 42(2):217–245

  • Huang J, Rossini AJ (1997) Sieve estimation for the proportional-odds failure-time regression model with interval censoring. J Am Stat Assoc 92(439):960–967

  • Lorentz GG (1986) Bernstein polynomials. Chelsea Publishing Co, New York

  • Lu M, Zhang Y, Huang J (2009) Semiparametric estimation methods for panel count data using monotone B-splines. J Am Stat Assoc 104(487):1060–1070

  • Ma L, Hu T, Sun J (2015) Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 102(3):731–738

  • Osman M, Ghosh SK (2012) Nonparametric regression models for right-censored data using Bernstein polynomials. Comput Stat Data Anal 56(3):559–573

  • Pollard D (1984) Convergence of stochastic processes. Springer, New York

  • Shen X (1997) On methods of sieves and penalization. Ann Stat 25(6):2555–2591

  • Shen X, Wong WH (1994) Convergence rate of sieve estimates. Ann Stat 22:580–615

  • Sun J (2006) The statistical analysis of interval-censored failure time data. Springer, New York

  • Sun J, Wei LJ (2000) Regression analysis of panel count data with covariate-dependent observation and censoring times. J R Stat Soc Ser B (Stat Methodol) 62(2):293–302

  • Sun J, Zhao X (2013) The statistical analysis of panel count data. Springer, New York

  • Turnbull BW (1976) The empirical distribution function with arbitrarily grouped censored and truncated data. J R Stat Soc Ser B 38:290–295

  • van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes: with application to statistics. Springer

  • Wellner JA, Zhang Y (2000) Two estimators of the mean of a counting process with panel count data. Ann Stat 28:779–814

  • Zhang Z, Sun J, Sun L (2005) Statistical analysis of current status data with informative observation times. Stat Med 24(9):1399–1407

  • Zhang Y, Hua L, Huang J (2010) A spline-based semiparametric maximum likelihood estimation method for the Cox model with interval-censored data. Scand J Stat 37(2):338–354

Acknowledgements

The authors wish to thank the Editor-in-Chief, Dr. Mei-Ling Lee, an Associate Editor and two reviewers for their many helpful comments and suggestions. The work was partly supported by the National Natural Science Foundation of China grants 11471135 and 11571133 and the Central China Normal University grant MOE 15ZD011 to the second author and the NIH grant R21CA198641 to the third author.

Author information

Corresponding author

Correspondence to Jianguo Sun.

Appendix: Proofs of the asymptotic properties of \(\hat{\theta }_n\)

In this Appendix, we sketch the proofs of the results given in Theorems 1–3 by using empirical process theory (van der Vaart and Wellner 1996) and some techniques commonly used in the nonparametric literature.

First let \(X=\{X_1,\ldots , X_n\}\) denote the observed data and define \(\mathcal {L}_n=\{l(\theta , X): \theta \in \varTheta _n\}\).

Proof of Theorem 1

For any \(\theta ^1=({\alpha ^1}', {\beta ^1}', \gamma ^1, \varLambda _1^1, \varLambda _2^1)'\) and \(\theta ^2=({\alpha ^2}', {\beta ^2}', \gamma ^2, \varLambda _1^2, \varLambda _2^2)'\in \varTheta _n\), it is easy to show that

$$\begin{aligned} \begin{aligned}&|l(\theta ^1, X)-l(\theta ^2, X)|\\ \le&K(\Vert \alpha ^1-\alpha ^2\Vert +\Vert \beta ^1-\beta ^2\Vert +\Vert \gamma ^1-\gamma ^2\Vert + \Vert \varLambda _1^1-\varLambda _1^2\Vert _{\infty }+\Vert \varLambda _2^1-\varLambda _2^2\Vert _{\infty }) \end{aligned} \end{aligned}$$
(1)

for some constant \(K\), by a Taylor series expansion. According to the result on page 94 of van der Vaart and Wellner (1996), we can show that the covering number of \(\mathcal {L}_n\) satisfies

$$\begin{aligned}&N\Big (\epsilon , \mathcal {L}_n, L_1(P_n)\Big )\\&\quad \le N\Big (\dfrac{\epsilon }{3M},\mathcal {B}, \Vert \cdot \Vert \Big )\cdot N\Big (\dfrac{\epsilon }{3M_n},\mathcal {M}_n^1, L_\infty \Big )\cdot N\Big (\dfrac{\epsilon }{3M_n},\mathcal {M}_n^2, L_\infty \Big )\\&\quad \le \Big (\dfrac{9M^2}{\epsilon }\Big )^{2p+1}\cdot \Big (\dfrac{9M_n^2}{\epsilon }\Big )^{m+1} \Big (\dfrac{9M_n^2}{\epsilon }\Big )^{m+1}\\&\quad \le K M^{4p+2}M_n^{4m+4}\epsilon ^{-p_m} , \end{aligned}$$

where \(p_m=2p+2m+3\). Then it follows from inequality (31) of Pollard (1984, page 31) that

$$\begin{aligned} \mathop {\mathrm{sup}}\limits _{\theta \in \varTheta _n}|P_n l(\theta , X)-P l(\theta , X)|\rightarrow 0 \end{aligned}$$
(2)

in probability. Define \(\varTheta _\epsilon =\{\theta : d(\theta , \theta _0)\ge \epsilon , \theta \in \varTheta _n\}\) and let \(J(\theta , X)=-l(\theta , X)\), \(\zeta _{1n}=\mathop {\mathrm{sup}}\limits _{\theta \in \varTheta _n}|P_n J(\theta , X)-P J(\theta , X)| \), and \(\zeta _{2n}= P_n J(\theta _0, X)-P J(\theta _0, X)\). Then

$$\begin{aligned} \mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon } P J(\theta , X)= & {} \mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon }\{P J(\theta , X)-P_n J(\theta , X)+P_n J(\theta , X)\} \\\le & {} \zeta _{1n}+\mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon }P_n J(\theta , X). \end{aligned}$$

If \({\hat{\theta }}_n\in \varTheta _\epsilon \), we have

$$\begin{aligned} \mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon } P_n J(\theta , X) = P_n J({\hat{\theta }}_n, X)\le P_n J(\theta _0, X) = \zeta _{2n} + P J(\theta _0, X). \end{aligned}$$

By Condition 4, we have \(\mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon } P J(\theta , X)-P J(\theta _0, X) = \delta _\epsilon >0\). Combining the two displays above then gives

$$\begin{aligned} \mathop {\mathrm{inf}}\limits _{\varTheta _\epsilon } P J(\theta , X)\le \zeta _{1n}+\zeta _{2n}+P J(\theta _0, X)=\zeta _{n}+P J(\theta _0, X) \end{aligned}$$

with \(\zeta _{n}=\zeta _{1n}+\zeta _{2n}\). Hence \(\zeta _{n}\ge \delta _\epsilon \) whenever \({\hat{\theta }}_n\in \varTheta _\epsilon \), that is, \(\{{\hat{\theta }}_n\in \varTheta _\epsilon \}\subseteq \{\zeta _n\ge \delta _\epsilon \}\). By (2) and the strong law of large numbers, we have \(\zeta _{1n}=o(1)\), \(\zeta _{2n}=o(1)\) and then \(\zeta _{n}=o(1)\) almost surely. Therefore, \(\cup _{k=1}^\infty \cap _{n=k}^\infty \{{\hat{\theta }}_n\in \varTheta _\epsilon \}\subseteq \cup _{k=1}^\infty \cap _{n=k}^\infty \{\zeta _n\ge \delta _\epsilon \}\), which shows the strong consistency of \({\hat{\theta }}_n\). \(\square \)

Proof of Theorem 2

To establish the convergence rate, note that by Theorem 1.6.2 of Lorentz (1986) or the proof of Theorem 2 in Osman and Ghosh (2012), if \(m=o(n^{\nu })\), there exist Bernstein polynomials \(\varLambda _{1n0}\) and \(\varLambda _{2n0}\) such that \(\Vert \varLambda _{1n0}-\varLambda _{10}\Vert _\infty =O(n^{-\frac{r\nu }{2}})\) and \(\Vert \varLambda _{2n0}-\varLambda _{20}\Vert _\infty =O(n^{-\frac{r\nu }{2}})\), respectively. For any \(\eta >0\), define the class \(\mathcal {F}_\eta =\{ l(\theta _{n0}, X) - l(\theta , X): \theta \in \varTheta _n, d(\theta , \theta _{n0})<\eta \}\) with \(\theta _{n0}=(\alpha '_0, \beta '_0, \gamma _0, \varLambda _{1n0}, \varLambda _{2n0})'\). Following the calculation of Shen and Wong (1994, p. 597), we can establish that \(\log N_{[]}(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)\le C N \log (\eta /\varepsilon )\) with \(N=2(m+1)\). Moreover, some algebraic calculations lead to \(\Vert l(\theta _{n0}, X)-l(\theta , X)\Vert ^2_2\le C \eta ^2\) for any \(l(\theta _{n0}, X)-l(\theta , X)\in \mathcal {F}_\eta \). Therefore it follows from Lemma 3.4.2 of van der Vaart and Wellner (1996) that

$$\begin{aligned} E_P \Vert n^{1/2} (P_n-P)\Vert _{\mathcal {F}_\eta }\le C J_{\eta }(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)\Big \{1+\dfrac{J_{\eta }(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)}{\eta ^2 n^{1/2}}\Big \}, \end{aligned}$$
(3)

where \(J_{\eta }(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)=\int _{0}^\eta \{1+\log N_{[]}(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)\}^{1/2} d\varepsilon \le C N^{1/2} \eta \).
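The bound \(J_{\eta }\le C N^{1/2}\eta \) can be checked directly from the entropy bound \(\log N_{[]}(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)\le C N \log (\eta /\varepsilon )\). The following is only a sketch of one way to do it, using the substitution \(\varepsilon =\eta u\), the inequality \((a+b)^{1/2}\le a^{1/2}+b^{1/2}\) and the identity \(\int _0^1\{\log (1/u)\}^{1/2}du=\Gamma (3/2)=\pi ^{1/2}/2\); the constant \(C'\) is introduced here for illustration.

$$\begin{aligned} J_{\eta }(\varepsilon , \mathcal {F}_\eta , \Vert \cdot \Vert _2)&\le \int _0^\eta \{1+C N\log (\eta /\varepsilon )\}^{1/2}d\varepsilon =\eta \int _0^1\{1+C N\log (1/u)\}^{1/2}du\\&\le \eta \Big \{1+(C N)^{1/2}\int _0^1\{\log (1/u)\}^{1/2}du\Big \} =\eta \Big \{1+\dfrac{(\pi C N)^{1/2}}{2}\Big \}\\&\le C' N^{1/2}\eta ,\qquad C'=1+\dfrac{(\pi C)^{1/2}}{2}, \end{aligned}$$

where the last step uses \(N=2(m+1)\ge 1\).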

Note that the right-hand side of (3) gives \(\phi _n (\eta )= C (N^{1/2}\eta + N/n^{1/2})\). Also it is easy to see that \(\phi _n(\eta )/\eta \) decreases in \(\eta \) and \(r_n^2\phi _n(1/r_n)=C(r_n N^{1/2}+ r_n^2 N/n^{1/2})= 2C n^{1/2}\), where \(r_n=N^{-1/2}n^{1/2}=n^ {(1-\nu )/2}\) with \(0<\nu < 0.5\). Hence \(n^{(1-\nu )/2} d({\hat{\theta }}_n, \theta _{n0})=O_p(1)\) by Theorem 3.4.1 of van der Vaart and Wellner (1996). This, together with \(d(\theta _{n0}, \theta _0)=O_p(n^{-r\nu /2})\), yields \(d({\hat{\theta }}_n, \theta _0)=O_p(n^{-(1-\nu )/2}+n^{-r\nu /2})\). The choice of \(\nu =1/(1+r)\) yields the rate of convergence \(d({\hat{\theta }}_n, \theta _0)=O_p(n^{-r/(2+2r)})\).
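The choice \(\nu =1/(1+r)\) balances the two terms in the rate. A minimal sketch of that arithmetic is given below, using exact fractions; the function names are hypothetical and \(r=2\) is chosen only as an example value of the smoothness index.

```python
from fractions import Fraction


def rate_exponents(r, nu):
    """Exponents a, b such that the two error terms are n^(-a) and n^(-b)."""
    estimation = (1 - nu) / 2        # from n^{-(1 - nu)/2}
    approximation = r * nu / 2       # from n^{-r nu / 2}
    return estimation, approximation


def overall_exponent(r, nu):
    """The slower of the two terms determines the overall rate."""
    return min(rate_exponents(r, nu))


def balanced_nu(r):
    """Value of nu at which both exponents coincide."""
    return Fraction(1, 1 + r)


if __name__ == "__main__":
    r = 2  # example smoothness index, an assumption for illustration
    nu = balanced_nu(r)
    print(nu, rate_exponents(r, nu), Fraction(r, 2 + 2 * r))
    # Moving nu away from the balanced value lowers the overall exponent.
    for other in (Fraction(1, 4), Fraction(2, 5)):
        print(other, overall_exponent(r, other))
```

With \(r=2\) the balanced value is \(\nu =1/3\) and both exponents equal \(1/3=r/(2+2r)\), so the rate is \(n^{-1/3}\); any other \(\nu \) makes one of the two terms slower.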

Proof of Theorem 3

As above, let \(\theta _0\) denote the true value of parameter \(\theta \) and define V to be the linear span of \(\varTheta - \theta _0\). Also let \(l(\theta , X)\) be the log-likelihood for a sample of size one and \(\delta _n = n^{-(1-\nu )/2}+n^{-r\nu /2}\). For any \(\theta \in \{\theta \in \varTheta : \Vert \theta -\theta _0\Vert =O(\delta _n)\}\), define the first order directional derivative of \(l(\theta , X)\) in the direction \(v\in V\) as

$$\begin{aligned} \dot{l}(\theta , X)[v]=\dfrac{d l(\theta + sv, X)}{ds}|_{s=0}, \end{aligned}$$

and the second order directional derivative as

$$\begin{aligned} \ddot{l}(\theta , X)[v,\tilde{v}]=\dfrac{d\dot{l}(\theta + \tilde{s} \tilde{v}, X)[v]}{d\tilde{s}}\Big |_{\tilde{s}=0}=\dfrac{d^2 l(\theta + sv+ \tilde{s} \tilde{v}, X)}{d\tilde{s}ds}\Big |_{\tilde{s}=s=0}. \end{aligned}$$

Also define the Fisher inner product on the space V as

$$\begin{aligned} <v,\tilde{v}>=P\{ \dot{l}(\theta , X)[v]\,\dot{l}(\theta , X)[\tilde{v}] \} \end{aligned}$$

and the Fisher norm for \(v\in V\) as \(\Vert v\Vert = < v, v>^{1/2}\). Let \(\bar{V}\) be the closed linear span of V under the Fisher norm. Then \((\bar{V},\Vert \cdot \Vert )\) is a Hilbert space.

Furthermore, define the smooth functional of \(\theta \) as

$$\begin{aligned} \psi (\theta ) = b'_1 \alpha + b'_2\beta +b_3 \gamma , \end{aligned}$$

where \(b=(b'_1, b'_2, b_3)'\) is any \((2p+1)\)-dimensional vector with \(\Vert b\Vert \le 1\). For any \(v\in V\), we denote

$$\begin{aligned} \dot{\psi }(\theta _0)[v]=\dfrac{d \psi (\theta _0+sv)}{ds}|_{s=0}. \end{aligned}$$

Note that \(\psi (\theta )-\psi (\theta _0)=\dot{\psi }(\theta _0)[\theta -\theta _0]\). It follows from the Riesz representation theorem that there exists \(v^* \in \bar{V}\) such that \(\dot{\psi }(\theta _0)[v]=<v^*, v>\) for all \(v\in \bar{V}\) and \(\Vert v^*\Vert =\Vert \dot{\psi }(\theta _0)\Vert \). Thus it follows from the Cramér-Wold device that to prove Theorem 3, it suffices to show that

$$\begin{aligned} n^{1/2}<{\hat{\theta }}_n -\theta _0, v^*> \rightarrow N(0,b'\Sigma b)\;\; \text{ in } \text{ distribution } \end{aligned}$$
(4)

since \(b'\{({\hat{\alpha }}_n -\alpha _0)', ({\hat{\beta }}_n -\beta _0)', ({\hat{\gamma }}_n -\gamma _0)\}'=\psi ({\hat{\theta }}_n)-\psi (\theta _0)=\dot{\psi }(\theta _0)[{\hat{\theta }}_n -\theta _0]=<{\hat{\theta }}_n -\theta _0, v^*>\). In fact, (4) can be proved using arguments similar to those in the proof of Theorem 1 of Shen (1997). For each component \(\vartheta _q\) of the regression parameter vector \(\vartheta =(\alpha ', \beta ', \gamma )'\), \(q=1,2,\ldots ,2p+1,\) we denote by \(\zeta ^*_q=(b_{1q}^*,b_{2q}^*)\) the solution to

$$\begin{aligned} \inf _{\zeta _q}E\Big \{l_\vartheta \cdot e_q-l_{b_1}[b_{1q}]-l_{b_2} [ b_{2q}]\Big \}^2 , \end{aligned}$$

where \(l_\vartheta =(l_\alpha ',l_\beta ', l_\gamma )'\) and \(e_q\) is a \((2p+1)\)-dimensional vector of zeros except for the q-th element, which equals 1. Here \(l_{b_1}[b_1]\) and \(l_{b_2}[b_2]\) are the directional derivatives with respect to \(\varLambda _1\) and \(\varLambda _2\), calculated in the same way as the directional derivatives defined at the beginning of the proof of Theorem 3. Now let \(\zeta ^*=(\zeta _1^*,\ldots ,\zeta _{2p+1}^*).\) By the calculations of Chen et al. (2012), we have \(\Vert v^*\Vert ^2=\Vert \dot{\psi }(\theta _0)\Vert ^2=\sup _{v\in \bar{V}:\Vert v\Vert >0}\frac{|\dot{\psi }(\theta _0)[v]|^2}{\Vert v\Vert ^2}=b'\Sigma b,\) where \(\Sigma =[E(S_\vartheta S_\vartheta ')]^{-1}\) and \(S_\vartheta =\{l_\vartheta -l_{b_1^*}[b_1^*]-l_{b_2^*}[b_2^*]\}.\) Hence the semiparametric efficiency can be established by applying the result of Theorem 4 in Shen (1997), which completes the proof.
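The construction of \(S_\vartheta \) as a residual after projecting \(l_\vartheta \) onto the nuisance directions has a simple finite-dimensional analogue, sketched below. This is an illustration only, not the infinite-dimensional argument of the proof: it assumes a scalar parameter of interest, a two-dimensional nuisance parameter, and a made-up \(3\times 3\) information matrix. The function names and numbers are hypothetical.

```python
from fractions import Fraction


def efficient_information(i_tt, i_tn, i_nn):
    """Project the score of interest onto two nuisance scores.

    i_tt: information for the parameter of interest (scalar)
    i_tn: cross-information with the two nuisance parameters (length 2)
    i_nn: 2 x 2 nuisance information matrix
    Returns the least-squares coefficients b* and the variance of the residual
    score, i_tt - i_tn @ inv(i_nn) @ i_tn.
    """
    a, b = Fraction(i_nn[0][0]), Fraction(i_nn[0][1])
    c, d = Fraction(i_nn[1][0]), Fraction(i_nn[1][1])
    x, y = Fraction(i_tn[0]), Fraction(i_tn[1])
    det = a * d - b * c
    # b* = inv(i_nn) @ i_tn, written out for the 2 x 2 case.
    b0 = (d * x - b * y) / det
    b1 = (a * y - c * x) / det
    eff = Fraction(i_tt) - (x * b0 + y * b1)
    return (b0, b1), eff


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


if __name__ == "__main__":
    # Made-up symmetric positive definite information matrix; row/column 0 is
    # the parameter of interest, rows/columns 1-2 are the nuisance parameters.
    info = [[4, 1, 2], [1, 3, 1], [2, 1, 5]]
    b_star, eff = efficient_information(info[0][0], info[0][1:],
                                        [row[1:] for row in info[1:]])
    sigma = 1 / eff  # analogue of Sigma = [E(S S')]^{-1}
    # Sigma should match the (0, 0) entry of the inverse information matrix.
    inv_00 = Fraction(info[1][1] * info[2][2] - info[1][2] * info[2][1], det3(info))
    print(b_star, eff, sigma, inv_00)
```

In this toy setting the inverse of the residual variance coincides with the corresponding entry of the inverse of the full information matrix, which is the finite-dimensional counterpart of the statement that \(\Sigma \) is the efficient variance bound.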

Cite this article

Xu, D., Zhao, H. & Sun, J. Joint analysis of interval-censored failure time data and panel count data. Lifetime Data Anal 24, 94–109 (2018). https://doi.org/10.1007/s10985-017-9397-0
