Abstract
We consider the semiparametric regression of panel count data occurring in longitudinal follow-up studies that concern the occurrence rate of certain recurrent events. The analysis of panel count data involves two processes, i.e., a recurrent event process of interest and an observation process controlling observation times. However, the model assumptions of existing methods, such as independent censoring times and the Poisson assumption, are restrictive and questionable. In this paper, we propose new joint models for panel count data by considering both informative observation times and censoring times. The asymptotic normality of the proposed estimators is established. Numerical results from simulation studies and a real data example show the advantage of the proposed method.
References
Andrews DF, Herzberg AM (2012) Data: a collection of problems from many fields for the student and research worker. Springer, Berlin
Cai J, Schaubel DE (2004) Analysis of recurrent event data. Handbook of statistics 23:603–623
Diggle PJ, Liang KY, Zeger SL (1994) The analysis of longitudinal data. Oxford University Press, Oxford
He X, Tong X, Sun J (2009) Semiparametric analysis of panel count data with correlated observation and follow-up times. Lifetime Data Anal 15:177–196
Hu XJ, Sun J, Wei LJ (2003) Regression parameter estimation from panel counts. Scand J Stat 30:25–43
Huang CY, Wang MC (2004) Joint modeling and estimation for recurrent event processes and failure time data. J Am Stat Assoc 99:1153–1165
Huang CY, Wang MC, Zhang Y (2006) Analysing panel count data with informative observation times. Biometrika 93:763–775
Jiang B, Li J, Fine J (2018) On two-step residual inclusion estimator for instrument variable additive hazards model. Biostat Epidemiol 2:47–60
Lagakos SW (1979) General right censoring and its impact on the analysis of survival data. Biometrics 35:139–156
Li Y, He X, Wang H, Zhang B, Sun J (2015) Semiparametric regression of multivariate panel count data with informative observation times. J Multivar Anal 140:209–219
Liang Y, Lu W, Ying Z (2009) Joint modeling and analysis of longitudinal data with informative observation times. Biometrics 65:377–384
Lin DY, Ying Z (1997) Additive hazards regression models for survival data. In: Proceedings of the first Seattle symposium in biostatistics: survival analysis. Springer
Lin DY, Ying Z (2001) Semiparametric and nonparametric regression analysis of longitudinal data. J Am Stat Assoc 96:103–126
Lin DY, Wei LJ, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate function of recurrent events. J R Stat Soc B 62:711–730
Liu K, Zhao X (2015) Robust estimation for longitudinal data with informative observation times. Can J Stat 43:519–533
Lu M, Zhang Y, Huang J (2007) Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94:1–14
Song X, Mu X, Sun L (2012) Regression analysis of longitudinal data with time-dependent covariates and informative observation times. Scand J Stat 39:248–258
Sun J, Wei LJ (2000) Regression analysis of panel count data with covariate-dependent observation and censoring times. J R Stat Soc Ser B 62:293–302
Sun J, Zhao X (2013) Statistical analysis of panel count data. Springer, New York
Sun J, Tong X, He X (2007) Regression analysis of panel count data with dependent observation times. Biometrics 63:1053–1059
Wellner JA, Zhang Y (2000) Two estimators of the mean of a counting process with panel count data. Ann Stat 28:779–814
Wellner JA, Zhang Y (2007) Two likelihood-based semiparametric estimation methods for panel count data with covariates. Ann Stat 35:2106–2142
Zhang Y (2002) A semiparametric pseudolikelihood estimation method for panel count data. Biometrika 89:39–48
Zhao X, Tong X (2011) Semiparametric regression analysis of panel count data with informative observation times. Comput Stat Data Anal 55:291–300
Zhao X, Tong X, Sun J (2013) Robust estimation for panel count data with informative observation times. Comput Stat Data Anal 57:33–40
Zhu L, Sun J, Tong X, Pounds S (2011) Regression analysis of longitudinal data with informative observation times and application to medical cost data. Stat Med 30:1429–1440
Acknowledgements
The authors would like to thank the Editor, the Associate Editor and the two reviewers for their constructive and insightful comments and suggestions that greatly improved the paper. This research is partly supported by the Research Grant Council of Hong Kong (15301218), the National Natural Science Foundation of China (No. 11771366), and The Hong Kong Polytechnic University.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Appendix: Proofs of asymptotics
To establish the consistency and asymptotic normality of \(\hat{\eta }\) and \(\hat{\beta }\), we assume that the \(\mathbf{X}_i(t)\) are of bounded variation. Furthermore, we assume that, as \(n\rightarrow +\infty \),
Define
and
It is easy to show that \(dM_i(t)\) and \(dR_i(t)\) are mean-zero stochastic processes. Also, define \(\bar{\mathbf{x }}(t)=\frac{s_4(t;\eta _0)}{s_1(t;\eta _0)}\), \(\bar{\mathbf{x }}^*(t)=\frac{s_3(t;\eta _0)}{s_1(t;\eta _0)}\).
A.1 Proof of consistency and asymptotic normality of \(\hat{\eta } =(\hat{\gamma }, \hat{\xi })\)
Recall that \(\hat{\eta }\) is the solution to \(V(\eta )=0\) where
By the Taylor series expansion, we have
where \(\eta ^* \in (\eta _0, \hat{\eta })\). Since,
thus, we have,
Define \(P_n=-\frac{1}{n}\frac{\partial V(\eta )}{\partial \eta }|_{\eta =\eta _0}\). Then we have,
since \(\frac{\partial ^2 V(\eta )}{\partial \eta \partial \eta '}|_{\eta =\eta ^*}\) is bounded in probability. Furthermore, following arguments similar to those given in Appendix 2 of Lin and Ying (2001), we have,
a sum of n independent mean 0 random vectors plus an asymptotically negligible term. Thus, by the multivariate CLT, \(\frac{1}{\sqrt{n}}V(\eta _0)\rightarrow N(0,\varSigma _{\eta })\), where \(\varSigma _{\eta }=E[\int _{0}^{\tau } R(t)\{\mathbf{X }_1^*(t)-\bar{\mathbf{x }}^*(t)\}dM_1(t)]^{\otimes 2}\).
Finally, we have \(\sqrt{n}(\hat{\eta }-\eta _0)\rightarrow P^{-1}N(0,\varSigma _{\eta })\), as \(n\rightarrow +\infty \), where
It is easy to show that \({\hat{P}}\) and \({\hat{\varSigma }}_{\eta }\) are consistent estimators of P and \(\varSigma _{\eta }\), respectively.
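The displayed equations of this proof are not reproduced in the text above. As a reading aid, the chain of steps can be sketched as follows. This is the standard estimating-equation argument written in the notation defined above, not necessarily the authors' exact displays. The remainder symbol \(r_n\) is introduced here for illustration only. It stands for the second-order term involving \(\frac{\partial ^2 V(\eta )}{\partial \eta \partial \eta '}|_{\eta =\eta ^*}\), which is assumed to be negligible after division by \(\sqrt{n}\) once \(\hat{\eta }\) is consistent.

```latex
\begin{align*}
0 = V(\hat{\eta})
  &= V(\eta_0)
   + \frac{\partial V(\eta)}{\partial \eta}\Big|_{\eta=\eta_0}(\hat{\eta}-\eta_0)
   + r_n, \\
\sqrt{n}\,(\hat{\eta}-\eta_0)
  &= P_n^{-1}\,\frac{1}{\sqrt{n}}\,V(\eta_0) + o_p(1),
  \qquad
  P_n = -\frac{1}{n}\frac{\partial V(\eta)}{\partial \eta}\Big|_{\eta=\eta_0}, \\
\sqrt{n}\,(\hat{\eta}-\eta_0)
  &\rightarrow N\!\left(0,\; P^{-1}\varSigma_{\eta}\,(P^{-1})'\right),
  \qquad
  P = \lim_{n\rightarrow+\infty} P_n .
\end{align*}
```

The last line restates \(P^{-1}N(0,\varSigma _{\eta })\) as a single normal law with a sandwich covariance, which is the form usually used when plugging in \({\hat{P}}\) and \({\hat{\varSigma }}_{\eta }\).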
A.2 Proof of consistency and asymptotic normality of \(\hat{\beta }\)
Recall that \(\hat{\beta }\) is the solution to \(U(\beta ;\hat{\eta })=0\) where
By the Taylor series expansion, we have
where \(\frac{\partial U(\beta ;\hat{\eta })}{\partial \beta }|_{\beta =\beta _0}=-\sum _{i=1}^n\int _{0}^{\tau }W(t)\{\mathbf{X }_i-\bar{\mathbf{X }}(t;\hat{\eta })\}N_i(t)\exp (-\beta _0'\mathbf{X }_i)\mathbf{X }_i'\varDelta _i(t)dO_i(t)\).
Define \(Q_n=-\frac{1}{n}\frac{\partial U(\beta ;\hat{\eta })}{\partial \beta }|_{\beta =\beta _0}\). Then we have \(\sqrt{n}(\hat{\beta }-\beta _0)=Q_n^{-1}\frac{1}{\sqrt{n}}U(\beta _0;\hat{\eta })+o_p(1)\), since \(\frac{\partial ^2 U(\beta ;\hat{\eta })}{\partial \beta \partial \beta '}|_{\beta =\beta ^*}\) is bounded. Therefore, it suffices to find the limiting distribution of \(\frac{1}{\sqrt{n}}U(\beta _0;\hat{\eta })\). Again, by the Taylor series expansion, we have
where \(\eta ^* \in (\eta _0,\hat{\eta })\).
As \(\hat{\eta }\) is a consistent estimator of \(\eta _0\), as shown in the previous section, and \(\frac{\partial ^2 U(\beta _0;\eta )}{\partial \eta \partial \eta '}|_{\eta =\eta ^*}\) is bounded, we have
By arguments similar to those used for \(V(\eta _0)\), we have
a sum of n independent mean 0 random vectors plus an asymptotically negligible term.
Now, we have
where \(T_n=\frac{1}{n}\frac{\partial U(\beta _0;\eta )}{\partial \eta }|_{\eta =\eta _0}, M_n=T_nP_n^{-1}, W_i(M_n)=U_i+M_nV_i\). Thus, by the multivariate CLT, we have \(\sqrt{n}(\hat{\beta }-\beta _0)\rightarrow Q^{-1}N(0,\varSigma _{\beta })\), where \(\varSigma _{\beta }=EW_1(M)^{\otimes 2}\), \(M=\lim _{n\rightarrow +\infty }M_n\), and
Next we compute M. Note that
Since
we have
Thus, we have \(M=TP^{-1}\), where
It is easy to show that \({\hat{Q}}\), \({\hat{\varSigma }}_{\beta }\) and \({\hat{M}}\) are consistent estimators of \(Q, \varSigma _{\beta }\) and M, respectively.
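To make the plug-in step concrete, the following is a minimal, dependency-free sketch of how a covariance estimate for \(\hat{\beta }\) could be assembled from the quantities named above, using \(M=TP^{-1}\), \(W_i(M)=U_i+MV_i\), \(\varSigma _{\beta }=EW_1(M)^{\otimes 2}\), and the limit \(Q^{-1}N(0,\varSigma _{\beta })\). The function names and argument layout are hypothetical. The per-subject contributions `U` and `V` and the matrices `Q`, `P`, `T` are assumed to have been computed already from the paper's formulas, which are not reproduced here.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def inverse(a: Matrix) -> Matrix:
    """Gauss-Jordan inversion with partial pivoting (fine for small matrices)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def two_stage_sandwich(U: Matrix, V: Matrix, Q: Matrix, P: Matrix, T: Matrix) -> Matrix:
    """Estimated covariance of beta-hat (hypothetical helper, not the authors' code).

    U: n rows, each the p-vector U_i (contribution to the beta estimating function).
    V: n rows, each the q-vector V_i (contribution to the eta estimating function).
    Q: p x p, P: q x q, T: p x q plug-in matrices.
    """
    n = len(U)
    M = matmul(T, inverse(P))  # M = T P^{-1}
    # W_i = U_i + M V_i carries the extra variability from estimating eta.
    W = [[u_k + sum(m_kj * v_j for m_kj, v_j in zip(M[k], v))
          for k, u_k in enumerate(u)]
         for u, v in zip(U, V)]
    p = len(W[0])
    # Sigma_beta is estimated by the average outer product of the W_i.
    sigma = [[sum(w[a] * w[b] for w in W) / n for b in range(p)] for a in range(p)]
    q_inv = inverse(Q)
    # Q^{-1} Sigma (Q^{-1})' is the covariance of sqrt(n)(beta-hat - beta_0).
    cov_sqrt_n = matmul(matmul(q_inv, sigma), transpose(q_inv))
    return [[c / n for c in row] for row in cov_sqrt_n]
```

Calling `two_stage_sandwich(U, V, Q, P, T)` with `T` set to a zero matrix reduces the result to the usual one-stage sandwich based on the `U` contributions alone, which is a quick way to see how much the first-stage estimation of \(\eta \) contributes to the variance.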
Cite this article
Jiang, H., Su, W. & Zhao, X. Robust estimation for panel count data with informative observation times and censoring times. Lifetime Data Anal 26, 65–84 (2020). https://doi.org/10.1007/s10985-018-09457-7