A New Approach for Regression Analysis of Multivariate Current Status Data with Informative Censoring

Communications in Mathematics and Statistics

Abstract

Regression analysis of interval-censored failure time data has recently attracted a great deal of attention, partly because such data arise increasingly often in many fields. In this paper, we discuss one type of such data, multivariate current status data, where in addition to the complex interval-censored data structure, one also faces dependent or informative censoring. For inference, a sieve maximum likelihood estimation procedure is developed, and the proposed estimators of the regression parameters are shown to be consistent and asymptotically efficient. To implement the method, an EM algorithm is provided, and the results of an extensive simulation study demonstrate the validity and good performance of the proposed inference procedure. As an illustration, the proposed approach is applied to a tumorigenicity experiment.


References

  1. Chang, I.S., Wen, C.C., Wu, Y.J.: A profile likelihood theory for the correlated gamma-frailty model with current status family data. Statistica Sinica 17, 1023–1046 (2007)

  2. Chen, C.M., Lu, T.F.C., Chen, M.H., Hsu, C.M.: Semiparametric transformation models for current status data with informative censoring. Biom. J. 19, 641–656 (2012)

  3. Chen, C.M., Wei, J.C., Hsu, C.M., Lee, M.Y.: Regression analysis of multivariate current status data with dependent censoring: application to ankylosing spondylitis data. Stat. Med. 33, 772–785 (2014)

  4. Chen, M.H., Tong, X.W., Sun, J.: The proportional odds model for multivariate interval-censored failure time data. Stat. Med. 26, 5147–5161 (2007)

  5. Cox, D.R.: Regression analysis and life tables (with discussion). J. R. Stat. Soc. B 34, 187–220 (1972)

  6. Dunson, D.B., Dinse, G.E.: Bayesian models for multivariate current status data with informative censoring. Biometrics 58, 79–88 (2002)

  7. Efron, B.: Censored data and the bootstrap. J. Am. Stat. Assoc. 76, 312–319 (1981)

  8. Finkelstein, D.M.: A proportional hazards model for interval-censored failure time data. Biometrics 42, 845–854 (1986)

  9. Goggins, W.B., Finkelstein, D.M.: A proportional hazards model for multivariate interval-censored failure time data. Biometrics 56, 940–943 (2000)

  10. Guo, G., Rodriguez, G.: Estimating a multivariate proportional hazards model for clustered data using the EM algorithm, with an application to child survival in Guatemala. J. Am. Stat. Assoc. 87, 969–976 (1992)

  11. Hu, T., Zhou, Q., Sun, J.: Regression analysis of bivariate current status data under the proportional hazards model. Can. J. Stat. 45, 410–424 (2017)

  12. Jewell, N.P., van der Laan, M.J., Lei, X.: Bivariate current status data with univariate monitoring times. Biometrika 92, 847–862 (2005)

  13. Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data, 2nd edn. Wiley, New York (2002)

  14. Li, S.W., Hu, T., Wang, P.J., Sun, J.: Regression analysis of current status data in the presence of dependent censoring with applications to tumorigenicity experiments. Comput. Stat. Data Anal. 110, 75–86 (2017)

  15. Lin, D.Y., Oakes, D., Ying, Z.: Additive hazards regression with current status data. Biometrika 85, 289–298 (1998)

  16. Liu, Y.Q., Hu, T., Sun, J.: Regression analysis of current status data in the presence of a cured subgroup and dependent censoring. Lifetime Data Anal. 23, 626–650 (2017)

  17. Lu, M., Zhang, Y., Huang, J.: Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94, 705–706 (2007)

  18. Ma, L., Hu, T., Sun, J.: Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 85, 649–658 (2015)

  19. National Toxicology Program: Toxicology and carcinogenesis studies of chloroprene (CAS No. 126-99-8) in \(F344/N\) rats and \(B6C3F_1\) mice (inhalation studies). Technical Report 467. U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, Bethesda, MD (1998)

  20. Pakes, A., Pollard, D.: Simulation and the asymptotics of optimization estimators. Econometrica 57, 1027–1057 (1989)

  21. Ramsay, J.O.: Monotone regression splines in action. Stat. Sci. 3, 425–441 (1988)

  22. Shen, X., Wong, W.H.: Convergence rate of sieve estimates. Ann. Stat. 22, 580–615 (1994)

  23. Su, Y.R., Wang, J.L.: Semiparametric efficient estimation for shared-frailty models with doubly-censored clustered data. Ann. Stat. 44, 1298–1331 (2016)

  24. Sun, J.: The Statistical Analysis of Interval-Censored Failure Time Data. Springer, New York (2006)

  25. van der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, New York (1998)

  26. van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, New York (1996)

  27. Wang, N., Wang, L., McMahan, C.S.: Regression analysis of bivariate current status data under the Gamma-frailty proportional hazards model using the EM algorithm. Comput. Stat. Data Anal. 83, 140–150 (2015)

  28. Wen, C.C., Chen, Y.H.: Nonparametric maximum likelihood analysis of clustered current status data with the gamma-frailty Cox model. Comput. Stat. Data Anal. 83, 140–150 (2011)

  29. Wei, L.J., Lin, D.Y., Weissfeld, L.: Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J. Am. Stat. Assoc. 84, 1065–1073 (1989)

  30. Zhang, Z., Sun, J., Sun, L.: Statistical analysis of current status data with informative observation times. Stat. Med. 24, 1399–1407 (2005)

  31. Zhao, S., Hu, T., Ma, L., Wang, P., Sun, J.: Regression analysis of informative current status data with the additive hazards model. Lifetime Data Anal. 21, 241–258 (2015)

Acknowledgements

The authors wish to thank the Editor-in-Chief, Dr. Zhiming Ma, the Associate Editor and two reviewers for their many helpful and insightful comments and suggestions, which greatly improved the paper. The research was partially supported by a grant from the Natural Science Foundation of China [Grant Number 11731011] and a grant from the key project of the Yunnan Province Foundation, China [Grant Number 202001BB050049].

Author information

Corresponding author

Correspondence to Jianguo Sun.

Appendix: Proofs of the Asymptotic Properties of \(\hat{\theta }_n\)

In this Appendix, we sketch the proofs of Theorems 4.1, 4.2 and 4.3. For this, we mainly use results on empirical processes given in van der Vaart and Wellner [26].

Proof of Theorem 4.1

To prove consistency, we verify the conditions of Theorem 5.7 of van der Vaart [25]. First, we verify the condition \(J_1:= \lim _{n}\sup _{\theta _n\in \Theta _n}|{\mathbb {P}}_n l(\theta , O)-{\mathbb {P}}l(\theta , O)|=o_p(1).\) Note that

$$\begin{aligned} J_1&\leqslant \lim _{n}\sup _{\theta _n\in \Theta _n}|{\mathbb {P}}_n l(\theta , O)-{\mathbb {P}}l(\theta _n,O)|+\lim _{n}\sup _{\theta _n\in \Theta _n}|{\mathbb {P}}l(\theta _n,O)-{\mathbb {P}}l(\theta ,O)|\\&=:J_{11}+J_{12}. \end{aligned}$$

Therefore, it is sufficient to prove that \(J_{1k}=o_p(1), k=1,2\). To prove that \(J_{11}=o_p(1)\), we need only verify that \(\varepsilon =\{l(\theta _n, O), \theta _n\in \Theta _n\}\) is a Euclidean class with envelope function \(\max _{\theta _n\in \Theta _n}l(\theta _n, O)\). By (A1), (A2) and Lemma 2.14 in Pakes and Pollard [20], the class \(\varepsilon \) is a Euclidean class. Hence, we have \(J_{11}=o_p(1)\). For \(J_{12}\), by Lemma A1 of Lu et al. [17] and the continuity of the log-likelihood function, we have \(J_{12}=o(1)\). Thus, the condition \(J_1=o_p(1)\) holds.

Next, we verify the other condition of Theorem 5.7 of van der Vaart [25], namely that for any \(\epsilon >0\),

$$\begin{aligned} \text{ sup}_{d(\theta , \theta _0)>\epsilon }{\mathbb {P}}l(\theta , O)<{\mathbb {P}}l(\theta _0, O). \end{aligned}$$

Note that this condition is satisfied according to condition (A4). Now, by Theorem 5.7 of Van der Vaart [25], we have \(d(\hat{\theta }_n, \theta _0)=o_p(1)\), which completes the proof of Theorem 4.1. \(\square \)
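As an informal numerical illustration of the two conditions of Theorem 5.7 used above, the following Python sketch works with a toy univariate current status model. Everything in it is an assumption made for illustration only: the exponential failure time with rate 1.0, the independent Uniform(0.5, 2) monitoring time, the grid on [0.5, 2] standing in for the parameter space, and all function names. It is not the multivariate informative-censoring model or the sieve space of the paper. The sketch computes the largest gap between the empirical and population mean log-likelihoods over the grid, which should shrink as the sample size grows, and evaluates the population criterion, which should be largest at the true rate.

```python
import math
import random


def loglik(theta, c, delta):
    """Log-likelihood of one current status observation (c, delta) under a
    toy exponential model with rate theta, where F(c) = 1 - exp(-theta * c)."""
    if delta == 1:
        return math.log(1.0 - math.exp(-theta * c))
    return -theta * c


def simulate(n, rng, theta0=1.0, lo=0.5, hi=2.0):
    """Draw n observations (C, delta) with delta = 1 if T <= C."""
    data = []
    for _ in range(n):
        t = rng.expovariate(theta0)   # failure time, never observed directly
        c = rng.uniform(lo, hi)       # monitoring time (independent here)
        data.append((c, 1 if t <= c else 0))
    return data


def population_mean(theta, theta0=1.0, lo=0.5, hi=2.0, nodes=2000):
    """Population mean of loglik(theta, C, delta), computed by a midpoint
    rule over C ~ Uniform(lo, hi) with P(delta = 1 | C) = 1 - exp(-theta0 C)."""
    h = (hi - lo) / nodes
    total = 0.0
    for i in range(nodes):
        c = lo + (i + 0.5) * h
        p = 1.0 - math.exp(-theta0 * c)
        total += p * loglik(theta, c, 1) + (1.0 - p) * loglik(theta, c, 0)
    return total / nodes


def sup_gap(n, seed=0):
    """Largest |empirical mean - population mean| over a grid of theta."""
    rng = random.Random(seed)
    data = simulate(n, rng)
    grid = [0.5 + 0.1 * j for j in range(16)]  # theta in [0.5, 2.0]
    worst = 0.0
    for theta in grid:
        emp = sum(loglik(theta, c, d) for c, d in data) / n
        worst = max(worst, abs(emp - population_mean(theta)))
    return worst


if __name__ == "__main__":
    for n in (200, 2000, 20000):
        print(n, sup_gap(n))
```

A grid maximum is only a stand-in for the supremum over \(\Theta _n\), and a single simulated sample per sample size gives a rough picture rather than a verification of the uniform convergence argued in the proof.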

Proof of Theorem 4.2

To derive the convergence rate, for any \(\omega >0\), define the class \({\mathcal {F}}_{\omega }=\{l(\theta _{n0}, O)-l(\theta , O): \theta \in \Theta _n, d(\theta , \theta _{n0})\leqslant \omega \}\) with \(\theta _{n0}=(\beta _0,\gamma _0,\Sigma _0, \Lambda _{1n0}, \ldots , \Lambda _{Kn0}, \Lambda _{cn0})\). Following the calculation of Shen and Wong [22, p. 597], we can establish that \(\log N_{[]}(\epsilon , {\mathcal {F}}_{\omega }, \parallel \cdot \parallel _{2})\leqslant CN \log (\omega /\epsilon )\) with \(N=2(s+k_n)\), where \(N_{[]}(\epsilon , {\mathcal {F}}, d)\) denotes the bracketing number (see Definition 2.1.6 in [26]) of a function class \({\mathcal {F}}\) with respect to the metric or semi-metric d. Moreover, some algebraic calculations lead to \(\parallel l(\theta _{n0},O)-l(\theta , O)\parallel ^2\leqslant C\omega ^2\) for any \(l(\theta _{n0},O)-l(\theta , O)\in {\mathcal {F}}_\omega \). Then Lemma 19.36 of van der Vaart [25] gives

$$\begin{aligned} E^{*}\sup _{d(\theta , \theta _0)<\omega }\parallel \sqrt{n}({\mathbb {P}}_n-{\mathbb {P}})(l(\theta , O)-l(\theta _0, O))\parallel =O(1)\omega ^{1/2}\left( 1+\frac{\omega ^{1/2}}{\epsilon ^2\sqrt{n}}M_1\right) , \end{aligned}$$

where \(E^{*}\) is the outer expectation and \(M_1\) is a positive constant. Let \(\phi _n(\omega )=\omega ^{1/2}(1+\frac{\omega ^{1/2}}{\epsilon ^2\sqrt{n}}M_1)\). Then \(\phi _n(\omega )/\omega \) is a decreasing function of \(\omega \), and \(n^{\frac{2}{3}}\phi _n(n^{-\frac{1}{3}})=O(\sqrt{n})\) for large n. Furthermore, by Theorem 4.1, we know that \(\hat{\theta }_n\) is consistent. By Theorem 3.4.1 of van der Vaart and Wellner [26], we can conclude that \(d(\hat{\theta }_n, \theta _0)=\{ \parallel \hat{\zeta }_n-\zeta _0 \parallel ^{2}+\parallel \hat{\Lambda }_{cn}(c)-\Lambda _{c0}(c)\parallel ^{2}+\sum \nolimits _{k=1}^{K}\int [\hat{\Lambda }_{kn}(c)-\Lambda _{k0}(c)]^{2}f_k(c)\mathrm{d}c\}^{\frac{1}{2}}=O_p(n^{-1/3})\), which completes the proof of Theorem 4.2. \(\square \)
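The rate arithmetic in this proof can be checked numerically. The short Python sketch below evaluates \(\phi _n\) as defined above and the ratio \(n^{2/3}\phi _n(n^{-1/3})/\sqrt{n}\), which should stay bounded as n grows. The values \(\epsilon =1\) and \(M_1=1\), and the function names, are assumptions chosen only to make the computation concrete; with these values the ratio equals \(1+n^{-2/3}\).

```python
import math


def phi(omega, n, eps=1.0, m1=1.0):
    """phi_n(omega) = omega^{1/2} * (1 + omega^{1/2} * M_1 / (eps^2 sqrt(n))).
    eps and m1 are placeholder constants (assumed, not from the paper)."""
    root = math.sqrt(omega)
    return root * (1.0 + root * m1 / (eps ** 2 * math.sqrt(n)))


def rate_ratio(n, eps=1.0, m1=1.0):
    """n^{2/3} * phi_n(n^{-1/3}) / sqrt(n); a bounded ratio is the rate
    condition that yields the n^{-1/3} convergence rate."""
    return n ** (2.0 / 3.0) * phi(n ** (-1.0 / 3.0), n, eps, m1) / math.sqrt(n)


def phi_over_omega(omega, n, eps=1.0, m1=1.0):
    """phi_n(omega) / omega, which should decrease as omega increases."""
    return phi(omega, n, eps, m1) / omega


if __name__ == "__main__":
    for n in (10 ** 2, 10 ** 4, 10 ** 6):
        print(n, rate_ratio(n))
```

Algebraically, \(\phi _n(\omega )/\omega =\omega ^{-1/2}+M_1/(\epsilon ^2\sqrt{n})\), so the monotonicity holds for any positive constants, not just the placeholder values used here.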

Proof of Theorem 4.3

The score functions for \(\beta \) and \(\gamma \) are denoted by \(S_{\beta }(\theta )\) and \(S_\gamma (\theta )\), respectively, where \(S_\beta (\theta )=\frac{\partial l(\beta , \gamma , \Sigma , {\mathcal {A}})}{\partial \beta }\) and \(S_\gamma (\theta )=\frac{\partial l(\beta , \gamma , \Sigma , {\mathcal {A}})}{\partial \gamma }\). For \(k = 1, \ldots , K\), let \(h_k(t)\) be a nonnegative and nondecreasing function on \([\tau _1, \tau _2]\), and define \(H =\{h = (h_1(t), \ldots , h_K(t))\}.\) Consider the parametric submodels \(\Lambda _\epsilon (t)=(\Lambda _{1, \epsilon }(t), \ldots , \Lambda _{K, \epsilon }(t))\), where \(\Lambda _{k, \epsilon }(t)=\Lambda _k(t)+\epsilon h_k(t)\). For each k, the score function along the kth submodel is given by \(S_{\Lambda _k}(\theta )[h_k]=\frac{\partial l(\beta , \gamma , \Sigma , \Lambda _{k, \epsilon })}{\partial \epsilon }|_{\epsilon =0}\). The efficient score for \(\zeta \) at \((\zeta _0, {\mathcal {A}}_0)\) is \({\tilde{l}}(\zeta _0, {\mathcal {A}}_0)=S_{\zeta }(\zeta _0, {\mathcal {A}}_0)-\sum \nolimits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k^{*}]\), where \(h_k^{*}\) is a \((d + 1)\)-vector function satisfying

$$\begin{aligned} {\mathbb {P}}\Big [\Big (S_{\zeta }(\zeta _0, {\mathcal {A}}_0)-\sum \limits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k^{*}]\Big )'\Big (\sum \limits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k]\Big )\Big ]=0, \end{aligned}$$

for each \(h=(h_1, \ldots , h_K)\) in H. By following calculations similar to those in Section 3 of Chang et al. [1], we can establish the existence of \(h_k^{*}\) in the above equation.
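For orientation only, it may help to recall what this projection looks like in a fully parametric model, where the nuisance parameter is a finite-dimensional vector \(\eta \) instead of \({\mathcal {A}}\). This is an assumption made purely for illustration and is not the setting of the paper; the symbols \(\eta \), \(S_\eta \), \(I_{\zeta \zeta }={\mathbb {P}}(S_\zeta S_\zeta ')\), \(I_{\zeta \eta }={\mathbb {P}}(S_\zeta S_\eta ')\), \(I_{\eta \zeta }=I_{\zeta \eta }'\) and \(I_{\eta \eta }={\mathbb {P}}(S_\eta S_\eta ')\) are introduced here. In that case the orthogonality condition reduces to matrix algebra:

$$\begin{aligned} {\tilde{l}}&=S_\zeta -I_{\zeta \eta }I_{\eta \eta }^{-1}S_\eta ,\\ {\mathbb {P}}({\tilde{l}}S_\eta ')&=I_{\zeta \eta }-I_{\zeta \eta }I_{\eta \eta }^{-1}I_{\eta \eta }=0,\\ {\mathbb {P}}({\tilde{l}}{\tilde{l}}')&=I_{\zeta \zeta }-I_{\zeta \eta }I_{\eta \eta }^{-1}I_{\eta \zeta }. \end{aligned}$$

The matrix \(I_{\zeta \eta }I_{\eta \eta }^{-1}\) plays the role of the least favorable direction \(h_k^{*}\): it is the choice that makes the residual score uncorrelated with every nuisance score, and the last line is the corresponding efficient information.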

The efficient Fisher information matrix \(I_0\) for \(\zeta \) at \((\zeta _0, {\mathcal {A}}_0)\) is defined as \({\mathbb {P}}({\tilde{l}}(\zeta _0, {\mathcal {A}}_0){\tilde{l}}'(\zeta _0, {\mathcal {A}}_0)).\) By Taylor expansion, we can obtain

$$\begin{aligned}&{\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}})={\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}}_0)\\&\quad +{\mathbb {P}}\Big \{\sum \limits _{k=1}^{K}S_{\zeta , k}(\theta )[\Lambda _k-\Lambda _{k0}]-\sum \limits _{k=1}^{K}\sum \limits _{j=1}^{K}S_{k, j}(\theta )[h_k^{*}, \Lambda _j-\Lambda _{j0}]\Big \}+O_p\Big (\sum \limits _{k=1}^{K}\parallel \Lambda _k-\Lambda _{k0}\parallel ^{2}\Big ). \end{aligned}$$

Note that \({\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}}_0)=0\), \({\mathbb {P}}(S_{\zeta }(\theta )S_{\Lambda _k}(\theta )[h_k])=-{\mathbb {P}}(S_{\zeta , k}(\theta )[h_k])\) and \({\mathbb {P}}(S_{\Lambda _k}(\theta )[{\tilde{h}}_k]S_{\Lambda _j}(\theta )[h_j])=-{\mathbb {P}}(S_{k, j}(\zeta )[{\tilde{h}}_k, h_j])\). By the consistency and the convergence rate of \({\hat{\Lambda }}_n\), we can conclude that \({\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=O_p(n^{-2/3})\). Moreover, \(\sqrt{n}({\mathbb {P}}_n-{\mathbb {P}})({\tilde{l}}({\hat{\zeta }}_n, {\hat{\mathcal {A}}}_n)- {\tilde{l}}(\zeta _0, {\mathcal {A}}_0))=o_p(1)\). Since \({\mathbb {P}}_n{\tilde{l}}({\hat{\theta }}_n)={\mathbb {P}}{\tilde{l}}(\theta _0)=0\) and \({\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=o_p(1)\), we have

$$\begin{aligned} -\sqrt{n}{\mathbb {P}}\big ({\tilde{l}}({\hat{\theta }}_n)-{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)\big )=\sqrt{n}{\mathbb {P}}_n({\tilde{l}}(\theta _0))+o_p(1). \end{aligned}$$

By the mean value theorem, we have

$$\begin{aligned} -\sqrt{n}{\mathbb {P}}\frac{\partial {\tilde{l}}(\zeta ', {\hat{\mathcal {A}}}_n)}{\partial \zeta }(\hat{\zeta }_n-\zeta _0)=\sqrt{n}{\mathbb {P}}_n({\tilde{l}}(\theta _0))+o_p(1), \end{aligned}$$

where \(\zeta '\) is a point between \(\hat{\zeta }_n\) and \(\zeta _0\). Since \(\hat{\theta }_n\) is consistent and \({\mathbb {P}}(-\frac{\partial {\tilde{l}}(\theta _0)}{\partial \zeta })={\mathbb {P}}({\tilde{l}}(\theta _0){\tilde{l}}'(\theta _0))=I_0\), we can conclude that

$$\begin{aligned} \sqrt{n}(\hat{\zeta }_n-\zeta _0)=I_0^{-1}\sqrt{n}{\mathbb {P}}_n({\tilde{l}}(\theta _0))+o_p(1){\mathop {\rightarrow }\limits ^{d}}N(0, I_0^{-1}). \end{aligned}$$

This completes the proof of Theorem 4.3. \(\square \)

Cite this article

Li, H., Ma, C., Sun, J. et al. A New Approach for Regression Analysis of Multivariate Current Status Data with Informative Censoring. Commun. Math. Stat. 11, 775–794 (2023). https://doi.org/10.1007/s40304-021-00274-3


  • DOI: https://doi.org/10.1007/s40304-021-00274-3
