Abstract
Longitudinal data arise in many fields, for example in medical follow-up studies that involve repeated measurements. Most existing approaches to their analysis assume that the observation or follow-up times are independent of the response process, either completely or given some covariates. In practice, this assumption may not hold. In this paper, we present a joint analysis approach that allows for possible mutual correlations, which are characterized by time-dependent random effects. Estimating equations are developed for parameter estimation, and the resulting estimators are shown to be consistent and asymptotically normal. The finite-sample performance of the proposed estimators is assessed through a simulation study, and an illustrative example from a skin cancer study is provided.
References
Cheng SC, Wei LJ (2000) Inferences for a semiparametric model with panel data. Biometrika 87:89–97
Ghosh D, Lin DY (2002) Marginal regression models for recurrent and terminal events. Stat Sin 12:663–688
He X, Tong X, Sun J (2009) Semiparametric analysis of panel count data with correlated observation and follow-up times. Lifetime Data Anal 15:177–196
Hu XJ, Sun J, Wei LJ (2003) Regression parameter estimation from panel counts. Scand J Stat 30:25–43
Huang CY, Wang MC, Zhang Y (2006) Analysing panel count data with informative observation times. Biometrika 93:763–775
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data. Wiley, New York
Li N, Zhao H, Sun J (2013) Semiparametric transformation models for panel count data with correlated observation and follow-up times. Stat Med 32(17):3039–3054
Li Y, Zhao H, Sun J, Kim KM (2014) Nonparametric tests for panel count data with unequal observation processes. Comput Stat Data Anal 73:103–111
Lin DY, Ying Z (2001) Semiparametric and nonparametric regression analysis of longitudinal data (with discussion). J Am Stat Assoc 96(453):103–113
Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572
Lin DY, Oakes D, Ying Z (1998) Additive hazards regression with current status data. Biometrika 85(2):289–298
Lin DY, Wei LJ, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B 62:711–730
Schoenfeld D (1982) Partial residuals for the proportional hazards regression model. Biometrika 69:239–241
Sun J, Kalbfleisch JD (1995) Estimation of the mean function of point processes based on panel count data. Stat Sin 5:279–289
Sun J, Wei LJ (2000) Regression analysis of panel count data with covariate-dependent observation and censoring times. J R Stat Soc Ser B 62:293–302
Sun J, Zhao X (2013) The statistical analysis of panel count data. Springer, New York
Sun J, Tong X, He X (2007) Regression analysis of panel count data with dependent observation times. Biometrics 63:1053–1059
Sun L, Song X, Zhou J, Liu L (2012) Joint analysis of longitudinal data with informative observation times and a dependent terminal event. J Am Stat Assoc 107(498):688–700
Tong X, Sun L, He X, Sun J (2009) Variable selection for panel count data via non-concave penalized estimating function. Scand J Stat 36:620–635
Wang H, Li Y, Sun J (2014) Focused and model average estimation for regression analysis of panel count data. Scand J Stat. doi:10.1002/sjos.12133
Wellner JA, Zhang Y (2000) Two estimators of the mean of a counting process with panel count data. Ann Stat 28:779–814
Wellner JA, Zhang Y (2007) Two likelihood-based semiparametric estimation methods for panel count data with covariates. Ann Stat 35:2106–2142
Zhang Y (2002) A semiparametric pseudolikelihood estimation method for panel count data. Biometrika 89:39–48
Zhao X, Tong X (2011) Semiparametric regression analysis of panel count data with informative observation times. Comput Stat Data Anal 55(1):291–300
Zhang Z, Sun J, Sun L (2005) Statistical analysis of current status data with informative observation times. Stat Med 24:1399–1407
Zhao X, Zhou J, Sun L (2011) Semiparametric transformation models with time-varying coefficients for recurrent and terminal events. Biometrics 67:404–414
Zhao H, Li Y, Sun J (2013a) Semiparametric analysis of multivariate panel count data with dependent observation processes and a terminal event. J Nonparametr Stat 25(2):379–394
Zhao X, Tong X, Sun J (2013b) Robust estimation for panel count data with informative observation times. Comput Stat Data Anal 57:33–40
Zhang H, Zhao H, Sun J, Wang D, Kim KM (2013) Regression analysis of multivariate panel count data with an informative observation process. J Multivar Anal 119:71–80
Acknowledgments
The authors wish to thank the editor, the associate editor and the two reviewers for their valuable comments that led to a great improvement of this manuscript. This work was supported, in part, by funds provided by The University of North Carolina at Charlotte (FRG 1-11172) to the first author.
Appendices
Appendix 1: Proof of Theorem 1
To derive the asymptotic properties of the proposed estimators \(\hat{\beta }\) and \(\hat{\eta }\), we need the following regularity conditions analogous to those given by Lin et al. (2000) (Sect. 2):
- (C1): \(\{\widetilde{N}_i(\cdot ), Y_i(\cdot ), C_i, \mathbf{Z}_i\}_{i=1}^n\) are independent and identically distributed.
- (C2): There exists a \(\tau >0\) such that \(P(C_i\ge \tau )>0\).
- (C3): Both \(\widetilde{N}_i(t)\) and \( Y_i(t)\) (\(0 \le t \le \tau , i=1,\ldots ,n\)) are bounded.
- (C4): W(t) and \(\mathbf{Z}_i,\, i=1,\dots ,n\), have bounded variations and W(t) converges almost surely to a deterministic function w(t) uniformly in \(t\in [0,\tau ]\).
- (C5): \(A_{\beta }=E\{\int _0^{\tau } w(t) e^{\beta '_0 \mathbf{Z}_i+\eta _0' \mathbf{X}_i(t)} [\mathbf{Z}_i-e_z(t)]^{\otimes 2}d{\varLambda }^*_2(t)\}\) and \({\varOmega }_\eta =E\Big [\int _0^{\tau }\big \{\mathbf{X}_i(t)-\bar{x}(t)\big \}^{\otimes 2} e^{\eta _0' \mathbf{X}_i(t)}d{\varLambda }^*_1(t)\Big ]\) are both positive definite.
Under condition (C2), we define
which is integrable under conditions (C3) and (C4). Also note that \(d\widehat{{\varLambda }}^*_2(t)\) satisfies
Let
and under (C1), let
The consistency of \(\hat{\beta }\) and \(\hat{\eta }\) follows from the facts that \(U_1(\beta _0;\hat{\eta })\) and \(U_{\eta }(\eta _0)\) both tend to 0 in probability as \(n\rightarrow \infty \), and that under condition (C5), \(\widehat{A}_\beta (\beta )\) and \(-n^{-1} \partial U_{\eta }(\eta ) / \partial \eta '\) both converge uniformly to the positive definite matrices \(A_{\beta }\) and \({\varOmega }_{\eta }\) over \(\beta \) and \(\eta \), respectively, in neighborhoods around the true values \(\beta _0\) and \(\eta _0\). Then the Taylor series expansions of \(U_1(\hat{\beta };\hat{\eta })\) at \((\beta _0;\hat{\eta })\) and \((\beta _0,\eta _0)\) yield \(n^{1/2}(\hat{\beta }-\beta _0)=A_\beta ^{-1}n^{-1/2}U_1(\beta _0;\hat{\eta })+o_p(1)= A_\beta ^{-1}\Big \{n^{-1/2}U_1(\beta _0;\eta _0)-A_\eta n^{1/2} (\hat{\eta }-\eta _0)\Big \}+o_p(1)\). The proof of Theorem 1 is sketched as follows:
- (1) First, differentiating \(U_1(\beta ;\hat{\eta })\) and (16) with respect to \(\beta \), we obtain
$$\begin{aligned} \widehat{A}_\beta (\beta )=n^{-1}\sum _{i=1}^n\int _0^{\tau }W(t)\big \{\mathbf{Z}_i-\widehat{E}_Z(t;\beta ,\hat{\eta })\big \}^{\otimes 2}e^{\beta '\mathbf{Z}_i+\hat{\eta }'\mathbf{X}_i(t)}d\widehat{{\varLambda }}^*_2(t;\beta ,\hat{\eta }). \end{aligned}$$
- (2) Solving (16) for \(d\widehat{{\varLambda }}^*_2(t;\beta _0,\eta _0)\) and substituting the result into \(U_1(\beta _0;\eta _0)\) yields
$$\begin{aligned} U_1(\beta _0;\eta _0)=\sum _{i=1}^n\int _0^{\tau }w(t)\Big (\mathbf{Z}_i-e_z(t)\Big )dM_i(t)+o_p(n^{1/2}), \end{aligned}$$
where \(e_z(t)=\lim _{n\rightarrow \infty }\widehat{E}_Z(t;\beta _0,\eta _0)\), as defined earlier in Sect. 3, and w(t) is the deterministic limit of W(t) in (C4).
- (3) Differentiating \(U_1(\beta _0;\eta )\) and (16) with respect to \(\eta \) yields
$$\begin{aligned} \widehat{A}_\eta (\eta )=n^{-1}\sum _{i=1}^n\int _0^{\tau }W(t)\big [\mathbf{Z}_i-\widehat{E}_Z(t;\beta _0,\eta )\big ] e^{\beta _0'\mathbf{Z}_i+\eta '\mathbf{X}_i(t)}\mathbf{X}_i'(t) d\widehat{{\varLambda }}^*_2(t;\beta _0,\eta ). \end{aligned}$$
- (4) According to equation (5) and by the asymptotic results in Lin et al. (2000) (A.5), one can show that
$$\begin{aligned} n^{1/2}\{\hat{\eta }-\eta _0\}={\varOmega }_\eta ^{-1}n^{-1/2}\sum _{i=1}^n\bigg [ \int _0^{\tau }\Big (\mathbf{X}_i(t)-\frac{s^{(1)}(t)}{s^{(0)}(t)}\Big ) d M^*_i(t)\bigg ] + o_p(1), \end{aligned} \tag{17}$$
where \({\varOmega }_\eta =E\Big [\int _0^{\tau }\big \{\mathbf{X}_i(t)-\bar{x}(t)\big \}^{\otimes 2} e^{\eta _0'\mathbf{X}_i(t)}d{\varLambda }^*_1(t)\Big ]\), which is invertible under (C5).
Combining the results in steps (1)–(4), we have
Since \(A_{\beta }\) is also invertible under (C5), it then follows from the multivariate central limit theorem that the conclusions hold.
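As a reading aid, the way steps (2) and (4) feed into the Taylor expansion can be written out as follows. This is only a sketch under two assumptions that are not stated explicitly in this appendix: that \(A_\eta \) denotes the limit of \(\widehat{A}_\eta (\eta _0)\), and that \(\bar{x}(t)=s^{(1)}(t)/s^{(0)}(t)\).

$$\begin{aligned} n^{1/2}(\hat{\beta }-\beta _0)&=A_\beta ^{-1}\Big \{n^{-1/2}U_1(\beta _0;\eta _0)-A_\eta n^{1/2}(\hat{\eta }-\eta _0)\Big \}+o_p(1)\\ &=A_\beta ^{-1}n^{-1/2}\sum _{i=1}^n\bigg [\int _0^{\tau }w(t)\big \{\mathbf{Z}_i-e_z(t)\big \}dM_i(t)-A_\eta {\varOmega }_\eta ^{-1}\int _0^{\tau }\big \{\mathbf{X}_i(t)-\bar{x}(t)\big \}dM^*_i(t)\bigg ]+o_p(1). \end{aligned}$$

Under these assumptions the two integrals inside the brackets are the terms written as \(v_{1i}\) and \(v_{2i}\) in Appendix 2, so the right-hand side is a normalized sum of i.i.d. mean-zero terms to which the multivariate central limit theorem applies.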
Appendix 2: Proof of the null distribution of \({\mathcal {F}}(t,z)\) in Section 3
Let \(V(\hat{\beta },\hat{\eta })=\sum _{i=1}^n\int _0^t I(\mathbf{Z}_i\le z)d\widehat{M}_i(s; \hat{\beta },\hat{\eta })\). Then, by a Taylor expansion,
Using arguments and algebraic manipulations similar to those in Appendix 1, we have \(V(\beta _0,\eta _0)=\sum _{i=1}^n u_{1i}(t,z)+o_p(n^{1/2})\), where \( u_{1i}(t,z)=\int _0^t \{I(\mathbf{Z}_i\le z)-e_I(s,z)\}dM_i(s)\). Also, \(\frac{\partial V(\beta _0,\eta _0)}{n\partial \eta '}\) and \(\frac{\partial V(\beta _0,\hat{\eta })}{n\partial \beta '}\) can be estimated by \(-\widehat{{\varPhi }}_\eta (t,z)\) and \(-\widehat{{\varPhi }}_\beta (t,z)\), respectively.
In addition, it follows from (17) and Theorem 1 that
and
where \(v_{1i}=\int _0^{\tau }w(t)\big (\mathbf{Z}_i- e_z(t)\big )d M_i(t)\) and \(v_{2i}=\int _0^{\tau } A_\eta {\varOmega }_\eta ^{-1}\big (\mathbf{X}_i(t)- \bar{x}(t)\big ) d M^*_i(t)\). Therefore, \({\mathcal {F}}(t,z;\hat{\beta },\hat{\eta })\) can be expressed as a sum of i.i.d. mean-zero terms for fixed t. By the multivariate central limit theorem, \({\mathcal {F}}(t,z)\) converges in finite-dimensional distribution to a mean-zero Gaussian process. Since \({\mathcal {F}}(t,z)\) is tight by empirical process theory, it converges weakly to a mean-zero Gaussian process, which can be approximated by \(\widehat{\mathcal {F}}(t,z)\) given by Eq. (10).
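To illustrate how such a Gaussian-process approximation is commonly used in practice, the sketch below simulates a supremum-type null distribution by multiplier resampling in the spirit of Lin et al. (1993). Eq. (10) is not reproduced in this appendix, so the code assumes that \(\widehat{\mathcal {F}}(t,z)\) has the usual form \(n^{-1/2}\sum _i G_i \hat{u}_i(t,z)\) with independent standard normal multipliers \(G_i\). The function name `multiplier_sup_pvalue`, the array `influence` and the grid layout are hypothetical and chosen only for this example.

```python
import numpy as np


def multiplier_sup_pvalue(influence, observed_sup, n_draws=1000, seed=0):
    """Approximate a sup-statistic p-value by multiplier resampling.

    influence    : (n, m) array; row i holds subject i's estimated mean-zero
                   contribution on m grid points (t, z). ASSUMED input, the
                   exact form of these terms comes from Eq. (10) of the paper.
    observed_sup : observed value of sup |F(t, z)| on the same grid.
    """
    rng = np.random.default_rng(seed)
    influence = np.asarray(influence, dtype=float)
    n = influence.shape[0]

    # Each row of G is one independent set of N(0, 1) multipliers G_1..G_n.
    G = rng.standard_normal((n_draws, n))

    # Perturbed process n^{-1/2} * sum_i G_i * u_i, evaluated on the grid.
    perturbed = G @ influence / np.sqrt(n)

    # Supremum of the absolute perturbed process for every draw.
    sup_draws = np.abs(perturbed).max(axis=1)

    # Share of simulated suprema at least as large as the observed one.
    p_value = float(np.mean(sup_draws >= observed_sup))
    return p_value, sup_draws
```

A call such as `multiplier_sup_pvalue(influence, observed_sup)` returns the estimated p-value together with the simulated suprema, which can also be used to plot a few simulated paths against the observed process. Fixing the observed data and perturbing only the multipliers avoids refitting the model for every draw.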
Cite this article
Li, Y., He, X., Wang, H. et al. Regression analysis of longitudinal data with correlated censoring and observation times. Lifetime Data Anal 22, 343–362 (2016). https://doi.org/10.1007/s10985-015-9334-z
DOI: https://doi.org/10.1007/s10985-015-9334-z