Semiparametric analysis of multivariate panel count data with nonlinear interactions

Abstract

Multivariate panel count data frequently arise in follow-up studies involving several related types of recurrent events. For univariate panel count data, several varying coefficient models have been developed. However, varying coefficient models for multivariate panel count data remain to be studied. In this paper, we propose a varying coefficient mean model for multivariate panel count data to describe possible nonlinear interaction effects between the covariates. A local logarithm partial likelihood procedure is used to estimate the unknown covariate effects, and a Breslow-type estimator is constructed for the baseline mean functions. The consistency and asymptotic normality of the proposed estimators are established under some mild conditions. The utility of the proposed approach is evaluated by numerical simulations and an application to data from a skin cancer study.

References

  1. Cai J, Fan J, Zhou H, Zhou Y (2007) Hazard models with varying coefficients for multivariate failure time data. Ann Statist 35(1):324–354

  2. Carroll RJ, Fan J, Gijbels I, Wand MP (1997) Generalized partially linear single-index models. J Am Statist Assoc 92(438):477–489

  3. Fan J, Gijbels I, King M (1997) Local likelihood and local partial likelihood in hazard regression. Ann Statist 25(4):1661–1690

  4. He X, Feng X, Tong X, Zhao X (2017) Semiparametric partially linear varying coefficient models with panel count data. Lifetime Data Anal 23(3):439–466

  5. He X, Tong X, Sun J (2009) Semiparametric analysis of panel count data with correlated observation and follow-up times. Lifetime Data Anal 15(2):177

  6. He X, Tong X, Sun J, Cook RJ (2008) Regression analysis of multivariate panel count data. Biostatistics 9(2):234–248

  7. Hu XJ, Sun J, Wei LJ (2003) Regression parameter estimation from panel counts. Scandinavian J Statist 30(1):25–43

  8. Li N, Park D, Sun J, Kim K (2011) Semiparametric transformation models for multivariate panel count data with dependent observation process. Can J Statist 39(3):458–474

  9. Li Y, He X, Wang H, Zhang B, Sun J (2015) Semiparametric regression of multivariate panel count data with informative observation times. J Multivariate Anal 140:209–219

  10. Lu M, Zhang Y (2007) Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94(3):705–718

  11. Lu M, Zhang Y, Huang J (2009) Semiparametric estimation methods for panel count data using monotone B-splines. J Am Statist Assoc 104(487):1060–1070

  12. Sun J, Zhao X (2013) Statistical Analysis of Panel Count Data. Springer, New York

  13. Tibshirani R, Hastie T (1987) Local likelihood estimation. J Am Statist Assoc 82(398):559–567

  14. Wang Y, Yu Z (2019) A kernel regression model for panel count data with time-varying coefficients. arXiv: Statistics Theory

  15. Wang Y, Yu Z (2021) A kernel regression model for panel count data with nonparametric covariate functions. Biometrics. https://doi.org/10.1111/biom.13440

  16. Zhang H, Zhao H, Sun J, Wang D, Kim K (2013) Regression analysis of multivariate panel count data with an informative observation process. J Multivariate Anal 119:71–80

  17. Zhao H, Li Y, Sun J (2013a) Analyzing panel count data with a dependent observation process and a terminal event. Can J Statist 41(1):174–191

  18. Zhao H, Li Y, Sun J (2013b) Semiparametric analysis of multivariate panel count data with dependent observation processes and a terminal event. J Nonparametr Statist 25(2):379–394

  19. Zhao H, Tu W, Yu Z (2018) A nonparametric time-varying coefficient model for panel count data. J Nonparametr Statist 30(3):640–661

  20. Zhao H, Virkler K, Sun J (2014) Nonparametric comparison for multivariate panel count data. Commun Statist Theory Methods 43(3):644–655

  21. Zhao X, Tong X (2011) Semiparametric regression analysis of panel count data with informative observation times. Comput Statist Data Anal 55(1):291–300

  22. Zhao X, Tong X, Sun J (2013) Robust estimation for panel count data with informative observation times. Comput Statist Data Anal 57(1):33–40

Acknowledgements

This paper was partially supported by the National Natural Science Foundation of China under Grant Nos. 12001485 and 12101549, the Ministry of Education Humanities and Social Sciences Research Youth Project of China under Grant No. 21YJCZH153, the National Bureau of Statistics of China under Grant No. 2020LY073 and the Characteristic & Preponderant Discipline of Key Construction Universities in Zhejiang Province (Zhejiang Gongshang University-Statistics).

Author information

Corresponding author

Correspondence to Yijun Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix I: tables and figures

See Tables 1, 2, 3 and Figs. 1, 2, 3, 4, 5, 6, 7 and 8.

Table 1 BIAS, SSE, BSE and 95% CP for the nonparametric parts \(\beta (\cdot )\), \(g'(\cdot )\) and \(g(\cdot )\) on pre-specified grid-points of Case I
Table 2 BIAS, SSE, BSE and 95% CP for the nonparametric parts \(\beta (\cdot )\), \(g'(\cdot )\) and \(g(\cdot )\) on pre-specified grid-points of Case II
Table 3 BIAS, SSE, BSE and 95% CP for the nonparametric parts \(\beta (\cdot )\), \(g'(\cdot )\) and \(g(\cdot )\) on pre-specified grid-points of Case III

Appendix II: proofs of asymptotic properties

The following lemma, which is similar to results in Fan et al. (1997) and Cai et al. (2007), is needed in the proofs of the theorems. A detailed proof of this lemma can be found in Cai et al. (2007).

Lemma 1

Define

$$\begin{aligned} c_{nk}(s,v)=\frac{1}{n}\sum _{i=1}^n Y_i(s)\Psi \{Z_i,V_i,(V_i-v)/h\}K_h(V_i-v)o_{ik}(s), \end{aligned}$$

and

$$\begin{aligned} c_k(s,v)=f(v)\int E\{Y(s)\Psi (Z,V,w)o_{k}(s)|V=v\}K(w) dw, \end{aligned}$$

where \(\Psi (\cdot ,\cdot ,\cdot )\) is continuous in its three arguments and \(E\{\Psi (Z,V,w)|V=v\}\) is continuous at the point v. Suppose that conditions (C1) and (C6) hold and that \(h\rightarrow 0\) and \(nh/\log n\rightarrow \infty \). Then

$$\begin{aligned} \sup _{0\le s\le \tau }\sum _{k=1}^K|c_{nk}(s,v)-c_k(s,v)|\rightarrow _p 0. \end{aligned}$$

Furthermore,

$$\begin{aligned} \sup _{0\le s\le \tau }\sup _{v\in B}\sum _{k=1}^K|c_{nk}(s,v)-c_k(s,v)|\rightarrow _p 0, \end{aligned}$$

where B is a compact set satisfying \(\inf _{v\in B} f(v)>0\).
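
As a rough numerical illustration of the kind of convergence stated in Lemma 1, the following sketch evaluates a simplified kernel-weighted average. It is not the setting of the lemma: the at-risk process \(Y_i(s)\) and the weight \(o_{ik}(s)\) are dropped, \(\Psi\) is taken to be \(\Psi (Z,V,w)=Z\), the kernel is assumed to be the Epanechnikov kernel with \(K_h(u)=K(u/h)/h\), and the data-generating model (uniform \(V\), so that \(f(v)=1\), and \(E(Z|V=v)=v\)) is invented for the example. Under these assumptions the limit is \(f(v)E(Z|V=v)\int K(w)dw=v\).

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def kernel_average(psi, V, v, h):
    """Simplified c_n(v) = (1/n) * sum_i psi_i * K_h(V_i - v), K_h(u) = K(u/h)/h."""
    return np.mean(psi * epanechnikov((V - v) / h) / h)

# Hypothetical data: V ~ Uniform(0, 1), Z = V + small Gaussian noise.
rng = np.random.default_rng(0)
n, h, v = 20000, 0.1, 0.5
V = rng.uniform(0.0, 1.0, size=n)
Z = V + 0.1 * rng.standard_normal(n)

c_n = kernel_average(Z, V, v, h)   # empirical kernel-weighted average
c_limit = 1.0 * v                  # f(v) * E[Z | V = v] under the assumptions above
print(c_n, c_limit)
```

With a larger \(n\) and a smaller \(h\) satisfying \(nh/\log n\rightarrow \infty \), the two printed values should move closer together.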

Proof of Theorem 4.1

By the definition of \({{\tilde{N}}_{ik}}(t)\), the process

$$\begin{aligned} M_{ik}(t)=\int _0^t Y_i(s)\left[ d{\tilde{N}_{ik}}(s)-\exp \{\beta _0(V_i)Z_i+g(V_i)\}\mu _{0k}(s)o_{ik}(s)ds\right] ,0\le t\le \tau , \end{aligned}$$

is a \(\bigcup _{i=1}^n \mathcal {F}_{t,ik}\) martingale, where \(\mathcal {F}_{t,ik}=\sigma \{{{\tilde{N}}_i}(s), Z_i,V_i,Y_i(s),0\le s\le t\}\), \(i=1,\cdots ,n\), \(k=1,\cdots ,K\). Let \(\gamma _0(v)\) denote the true value of \(\gamma (v)\) and let \(\zeta (v)=H\{\gamma (v)-\gamma _0(v)\}\). Then we have

$$\begin{aligned}&\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}=\\&\quad \frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^t K_h(V_i-v)Y_i(s)\left[ \gamma _0(v)^\top X_i+\zeta (v)^\top X_i^*-\log S_{kh,0}\{s,\zeta (v)\}\right] d{{\tilde{N}}_{ik}}(s), \end{aligned}$$

where \(X_i^*=H^{-1}X_i\) and

$$\begin{aligned} S_{kh,j}(s,\zeta ,v)&=\frac{1}{n}\sum _{i=1}^n K_h(V_i-v) Y_i(s)X_i^{*\otimes j}\exp \{\gamma _0(v)^\top X_i+\zeta (v)^\top X_i^*\}o_{ik}(s),\\&\quad j=0,1,2. \end{aligned}$$
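
To make the kernel-weighted sums \(S_{kh,j}\) concrete, here is a minimal sketch that evaluates them at one fixed time \(s\) and one event type \(k\). Several ingredients are assumptions rather than definitions taken from the text above: a scalar covariate \(Z_i\), a local linear fit (\(d=1\)) with \(X_i=(Z_i, Z_i(V_i-v), V_i-v)^\top \) and \(H=\textrm{diag}(1,h,h)\), the Epanechnikov kernel with \(K_h(u)=K(u/h)/h\), and arrays `Y` and `o` holding the values \(Y_i(s)\) and \(o_{ik}(s)\). The function name `S_kh` and the simulated inputs are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, assumed here for illustration."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def S_kh(j, Z, V, Y, o, gamma0, zeta, v, h):
    """Kernel-weighted sum S_{kh,j}(s, zeta, v) at one fixed time s, for j = 0, 1, 2."""
    u = V - v
    X = np.column_stack([Z, Z * u, u])               # assumed X_i (local linear, d = 1)
    X_star = np.column_stack([Z, Z * u / h, u / h])  # X_i^* = H^{-1} X_i with H = diag(1, h, h)
    K_h = epanechnikov(u / h) / h
    w = K_h * Y * o * np.exp(X @ gamma0 + X_star @ zeta)
    if j == 0:
        return w.mean()                              # scalar
    if j == 1:
        return (w[:, None] * X_star).mean(axis=0)    # vector of length 3
    if j == 2:
        # average of w_i * X_i^* X_i^{*T}, a 3 x 3 matrix
        return np.einsum("i,ij,ik->jk", w, X_star, X_star) / len(w)
    raise ValueError("j must be 0, 1 or 2")

# Hypothetical inputs for a single time point s.
rng = np.random.default_rng(1)
n, v, h = 500, 0.5, 0.2
V = rng.uniform(0.0, 1.0, n)
Z = rng.binomial(1, 0.5, n).astype(float)
Y = (rng.uniform(0.0, 1.0, n) < 0.8).astype(float)   # at-risk indicators Y_i(s)
o = np.ones(n)                                       # placeholder values for o_ik(s)
gamma0 = np.array([0.5, 0.0, 1.0])
zeta = np.zeros(3)

S0, S1, S2 = (S_kh(j, Z, V, Y, o, gamma0, zeta, v, h) for j in range(3))
# Weighted covariance of X^*; matrices of this form appear in the second derivative of the local likelihood.
V_local = S2 / S0 - np.outer(S1 / S0, S1 / S0)
```

Because the weights are nonnegative, `V_local` is a weighted covariance matrix and hence positive semidefinite, which is the source of the concavity used later in the proof.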

Furthermore, we define

$$\begin{aligned} S^*_{kh,j}&(s,\beta ,g,v)=\frac{1}{n}\sum _{i=1}^n K_h(V_i-v) Y_i(s)X_i^{*\otimes j}\exp \{\beta (V_i) Z_i+g(V_i)\}o_{ik}(s), \end{aligned}$$

and

$$\begin{aligned} s^*_{kh,0}&(s,\beta ,g,v)=f(v)E[Y(s)\exp \{\beta (V) Z+g(V)\}o_{k}(s)|V=v],\\ s^*_{kh,1}&(s,\beta ,g,v)=f(v)E[Y(s)\exp \{\beta (V) Z+g(V)\}(Z,Zu_1^\top ,u_1^\top )^\top o_{k}(s)|V=v],\\ s^*_{kh,2}&(s,\beta ,g,v)=f(v)E[Y(s)\exp \{\beta (V) Z+g(V)\}\Pi (Z)o_{k}(s)|V=v], \end{aligned}$$

where

$$\begin{aligned}\Pi (Z)= \begin{pmatrix} ZZ^\top &{} ZZ^\top u_1^\top &{} Zu_1^\top \\ ZZ^\top u_1 &{}ZZ^\top u_2 &{} Zu_2\\ Zu_1 &{}Z u_2 &{}u_2\\ \end{pmatrix}. \end{aligned}$$

Then,

$$\begin{aligned}&\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}-\ell \{\gamma _0(v),t\}\\&\quad =\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^t K_h(V_i-v)\left\{ \zeta (v)^\top X_i^*-\log \frac{S_{kh,0}(s,\zeta ,v)}{S_{kh,0}(s,0,v)}\right\} d{M_{ik}}(s)\\&\qquad +\sum _{k=1}^K\int _0^t \left\{ \zeta (v)^\top S^*_{kh,1}(s,\beta _0,g_0,v) -\log \frac{S_{kh,0}(s,\zeta ,v)}{S_{kh,0}(s,0,v)}S^*_{kh,0}(s,\beta _0,g_0,v)\right\} \mu _{0k}(s)ds\\&\quad =\sum _{k=1}^K I_{1k}\{\zeta (v),t\}+I_2\{\zeta (v),t\}. \end{aligned}$$

By Lemma 1, we obtain

$$\begin{aligned}&I_2\{\zeta (v),t\}\\&\quad =\sum _{k=1}^K\int _0^t \left[ \zeta (v)^\top s^*_{kh,1}(s,\beta _0,g_0,v) -\log \frac{s_{kh,0}(s,\zeta ,v)}{s_{kh,0}(s,0,v)}s^*_{kh,0}(s,\beta _0,g_0,v)\right] \\&\qquad \mu _{0k}(s)ds+o_p(1)\\&\quad =I^*_2\{\zeta (v),t\}+o_p(1). \end{aligned}$$

Fig. 1 Estimated curves for \(\beta (\cdot )\) and \(g(\cdot )\) of Case I under \(n=100\) and 200
Fig. 2 Boxplots of RASEs for \(\beta (\cdot )\) and \(g(\cdot )\) of Case I under \(n=100\) and 200
Fig. 3 Estimated curves for \(\beta (\cdot )\) and \(g(\cdot )\) of Case II under \(n=100\) and 200
Fig. 4 Boxplots of RASEs for \(\beta (\cdot )\) and \(g(\cdot )\) of Case II under \(n=100\) and 200
Fig. 5 Estimated curves for \(\beta (\cdot )\) and \(g(\cdot )\) of Case III under \(n=100\) and 200
Fig. 6 Boxplots of RASEs for \(\beta (\cdot )\) and \(g(\cdot )\) of Case III under \(n=100\) and 200
Fig. 7 Estimated curves for \(\mu _{01}(\cdot )\) and \(\mu _{02}(\cdot )\) of Case I under \(n=100\) and 200
Fig. 8 Estimated curves for \(\mu _{01}(\cdot )\) and \(\mu _{02}(\cdot )\) of Case II under \(n=100\) and 200
Fig. 9 Estimated curves for \(\mu _{01}(\cdot )\), \(\mu _{02}(\cdot )\) and \(\mu _{03}(\cdot )\) of Case III under \(n=100\) and 200
Fig. 10 Estimated curves for \(\beta (\cdot )\) and \(g(\cdot )\) of Case IV under \(n=100\) and 200
Fig. 11 Estimated curves for nonparametric parts of skin cancer study

It can easily be shown that \(I_2\{\zeta (v),t\}\) is strictly concave with respect to \(\zeta (v)\) and attains its maximum at \(\zeta (v)=0\). Next, note that \(I_{1k}\{\zeta (v),t\}\) is a local square-integrable martingale with predictable variation process

$$\begin{aligned} \langle I_{1k}\{\zeta (v),t\},I_{1k}\{\zeta (v),t\}\rangle =&\frac{1}{n^2}\sum _{i=1}^n\int _0^t K_h^2(V_i-v)Y_i(s)\\&\left\{ \zeta (v)^\top X_i^*-\log \frac{S_{kh,0}(s,\zeta ,v)}{S_{kh,0}(s,0,v)}\right\} ^2\\&\times \exp \{\beta _0(V_i)Z_i+g(V_i)\}\mu _{0k}(s)o_{ik}(s)ds, \end{aligned}$$

and, by Lemma 1, we have

$$\begin{aligned} E I_{1k}^2\{\zeta (v),t\}=E\langle I_{1k}\{\zeta (v),t\},I_{1k}\{\zeta (v),t\}\rangle =O\left( \frac{1}{nh}\right) \rightarrow 0. \end{aligned}$$
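
As a heuristic check on the rate \(O\{1/(nh)\}\) above, assume (these conditions are not stated here and are only for illustration) that \(K_h(u)=K(u/h)/h\), that the remaining factors of the integrand are bounded, and that \(f\) is continuous at v. The substitution \(w=(u-v)/h\) then gives

$$\begin{aligned} E\{K_h^2(V-v)\}=\frac{1}{h^2}\int K^2\left( \frac{u-v}{h}\right) f(u)du=\frac{1}{h}\int K^2(w)f(v+hw)dw=\frac{f(v)}{h}\int K^2(w)dw\,\{1+o(1)\}. \end{aligned}$$

The variation process is \(n^{-2}\) times a sum of n terms, each with expectation of order \(h^{-1}\), so its expectation is of order \(n\cdot n^{-2}\cdot h^{-1}=1/(nh)\), which vanishes because \(nh\rightarrow \infty \).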

This implies that \(I_{1k}\{\zeta (v),t\}\rightarrow _p 0\) for \(k=1,\cdots ,K\). Hence,

$$\begin{aligned} \ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}-\ell \{\gamma _0(v),t\}=I^*_2\{\zeta (v),t\}+o_p(1). \end{aligned}$$

Then \(\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}-\ell \{\gamma _0(v),t\}\) is strictly concave with respect to \(\zeta (v)\) and attains its maximum at \(\zeta (v)=0\). By Lemma A.1 of Carroll et al. (1997), \({\hat{\zeta }}(v)\rightarrow _p 0\), so

$$\begin{aligned} H\{{\hat{\gamma }}(v)-\gamma _0(v)\}\rightarrow _p 0. \end{aligned}$$

Similarly, we obtain

$$\begin{aligned} \sup _{v\in \Phi _v}|H\{{\hat{\gamma }}(v)-\gamma _0(v)\}|\rightarrow _p 0. \end{aligned}$$

This completes the proof of Theorem 4.1. \(\square \)

Proof of Theorem 4.2

Denote

$$\begin{aligned}&\ell '\{\gamma _0(v),\tau \}=\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^\tau K_h(V_i-v)\left\{ X_i^*-\frac{S_{kh,1}(s,0,v)}{S_{kh,0}(s,0,v)}\right\} d{M_{ik}}(s)\\&\qquad +\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^\tau K_h(V_i-v)Y_i(s)\left\{ X_i^*-\frac{S_{kh,1}(s,0,v)}{S_{kh,0}(s,0,v)}\right\} \exp \{\beta _0(V_i)Z_i+g_0(V_i)\}\mu _{0k}(s)o_{ik}(s)ds\\&\quad =I_3(v,\tau )+I_4(v,\tau ). \end{aligned}$$

By a Taylor expansion and Lemma 1, we have

$$\begin{aligned} I_4(v,\tau )=&\frac{1}{(d+1)!n}\sum _{k=1}^K\sum _{i=1}^n\int _0^\tau K_h(V_i-v)Y_i(s)\left\{ X_i^*-\frac{s^*_{kh,1}(s,\beta _0,g_0,v)}{s^*_{kh,0}(s,\beta _0,g_0,v)}\right\} \\&\times \exp \{\gamma _0(v)^\top X_i^*+g_0(v)\}\{\beta ^{(d+1)}_0(v)Z_i\\&+g^{(d+1)}_0(v)\}(V_i-v)^{d+1}\mu _{0k}(s)o_{ik}(s)ds\{1+O_p(h^d)\}\\ =&\frac{h^{d+1}}{(d+1)!}b[\Gamma (v)^{-1}\beta ^{(d+1)}_0(v), 0,\cdots ,0]^\top \{1+O_p(h^d)\}=A(v,\tau ), \end{aligned}$$

where \(b=\int x^{d+1}K(x)dx\). In addition, we have

$$\begin{aligned}&I_3(v,\tau )=\frac{1}{n}\sum _{i=1}^n\sum _{k=1}^K\int _0^\tau K_h(V_i-v)\left\{ X_i^*-\frac{s_{kh,1}(s,\zeta ,v)}{s_{kh,0}(s,\zeta ,v)}\right\} d{M_{ik}}(s)+o_p(1)\\&\quad =\frac{1}{n}\sum _{i=1}^n\sum _{k=1}^K A_{ik}(v,\tau )+o_p(1) \end{aligned}$$

It is easily seen that \(\sqrt{nh}I_3(v,\tau )\) is a sum of i.i.d. random vectors \(\sum _{k=1}^K A_{ik}(v,\tau )\) with zero mean and finite variance. By direct calculation, the asymptotic variance is

$$\begin{aligned} \Sigma&=\lim _{n\rightarrow \infty } Eh\left\{ \sum _{k=1}^K A_{1k}(v,\tau )\right\} ^{\otimes 2}\\&=\sum _{k_1=1}^K\sum _{k_2=1,k_2\ne k_1}^K \lim _{n\rightarrow \infty } Eh A_{1k_1}(v,\tau )A_{1k_2}(v,\tau )^\top +\sum _{k=1}^K \lim _{n\rightarrow \infty } EhA_{1k}(v,\tau )^{\otimes 2}\\&= \Sigma _1+ \Sigma _{11}. \end{aligned}$$

As \(\sum _{i=1}^n A_{ik}(v,\tau )\) is a local square-integrable martingale, it can be easily obtained that \(\Sigma _{11}\) converges to \(\Sigma _2\), where \(\Sigma _2\) is \(\textrm{diag}\{\Gamma ^{-1}(v)\nu _0,Q_2\nu _2\}\).

By Theorem 4.1, we have \({\hat{\zeta }}(v) \rightarrow 0\) in probability. Therefore, based on the mean value theorem, we can obtain that

$$\begin{aligned}&\ell ''\{\gamma _0(v)+H^{-1}{\hat{\zeta }}(v),t\}= \ell ''\{\gamma _0(v),t\}+o_p(1)\\&\quad =\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^t K_h(V_i-v)\\&\qquad \left\{ \frac{s_{kh,2}^*(s,\beta _0,g_0,v)s_{kh,0}^*(s,\beta _0,g_0,v)-s_{kh,1}^*(s,\beta _0,g_0,v)s_{kh,1}^*(s,\beta _0,g_0,v)^\top }{s_{kh,0}^*(s,\beta _0,g_0,v)^2}\right\} \\&\qquad \times d N_{ik}(s)+o_p(1)\\&\quad =\sum _{k=1}^K\int _0^t \left\{ \frac{s_{kh,2}^*(s,\beta _0,g_0,v)s_{kh,0}^*(s,\beta _0,g_0,v)-s_{kh,1}^*(s,\beta _0,g_0,v)s_{kh,1}^*(s,\beta _0,g_0,v)^\top }{s_{kh,0}^*(s,\beta _0,g_0,v)^2}\right\} \\&\qquad \times s_{kh,0}^*(s,\beta _0,g_0,v) d s+o_p(1)\\&\quad =-B(v,\tau )+o_p(1). \end{aligned}$$

Since \({\hat{\zeta }}(v)\) is the maximizer of the function \(\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}\), we have

$$\begin{aligned} -\ell '\{\gamma _0(v),t\}=\ell '\{\gamma _0(v)+H^{-1}{\hat{\zeta }}(v),t\}-\ell '\{\gamma _0(v),t\}=\ell ''\{\gamma _0(v)+H^{-1}{\hat{\zeta }}^{*}(v),t\}^\top {\hat{\zeta }}, \end{aligned}$$

where \({\hat{\zeta }}^{*}(v)\) lies between 0 and \({\hat{\zeta }}(v)\) (the second equality is obtained by a Taylor expansion of \(\ell '\{\gamma _0(v)+H^{-1}\zeta (v),t\}\) around 0). Hence,

$$\begin{aligned}&{\hat{\zeta }}(v)-B(v,\tau )^{-1}A(v,\tau )=\\&\quad -\ell ''\{\gamma _0(v)+H^{-1}{\hat{\zeta }}^*(v),t\}^{-1} [\ell '\{\gamma _0(v),t\}-A(v,\tau )]+o_p(1). \end{aligned}$$

By Slutsky’s theorem, we have

$$\begin{aligned} \sqrt{nh} \{{\hat{\zeta }}(v)-B(v,\tau )^{-1}A(v,\tau )\}\rightarrow _d N\{0, B(v,\tau )^{-1}\Sigma B(v,\tau )^{-1}\}. \end{aligned}$$

This completes the proof of Theorem 4.2. \(\square \)
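
The sandwich form \(B(v,\tau )^{-1}\Sigma B(v,\tau )^{-1}\) in the limiting distribution can be turned into approximate pointwise standard errors once estimates of \(B\) and \(\Sigma \) are available. The sketch below only shows that arithmetic; the function name `sandwich_se` and the numerical inputs are hypothetical, how \(B\) and \(\Sigma \) would be estimated is not specified here, and this is not claimed to be the procedure behind the standard errors reported in the tables.

```python
import numpy as np

def sandwich_se(B, Sigma, n, h):
    """Approximate standard errors of zeta_hat(v) from the limit
    sqrt(n h) * (zeta_hat - bias) -> N(0, B^{-1} Sigma B^{-1})."""
    B_inv = np.linalg.inv(B)
    cov = B_inv @ Sigma @ B_inv / (n * h)   # finite-sample approximation
    return np.sqrt(np.diag(cov))

# Hypothetical plug-in values, chosen only to make the arithmetic easy to follow.
B = np.diag([2.0, 4.0])
Sigma = np.diag([1.0, 4.0])
n, h = 400, 0.25

se = sandwich_se(B, Sigma, n, h)
# Since zeta = H(gamma - gamma_0), standard errors for gamma_hat follow by
# rescaling each component with the corresponding diagonal entry of H^{-1}.
print(se)   # each component equals 0.05 for these inputs
```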

About this article

Cite this article

Wang, W., Wang, Y. & Zhao, X. Semiparametric analysis of multivariate panel count data with nonlinear interactions. Lifetime Data Anal (2021). https://doi.org/10.1007/s10985-021-09537-1

Keywords

  • Multivariate panel count data
  • Varying coefficient
  • Local logarithm partial likelihood function
  • Taylor expansion