Semiparametric analysis of multivariate panel count data with nonlinear interactions

Wang, Weiwei; Wang, Yijun; Zhao, Xiaobing

doi:10.1007/s10985-021-09537-1

Published: 05 October 2021

Volume 28, pages 89–115, (2022)
Lifetime Data Analysis

Weiwei Wang^1,2,
Yijun Wang^1,2 &
Xiaobing Zhao^3

Abstract

Multivariate panel count data frequently arise in follow-up studies involving several related types of recurrent events. For univariate panel count data, several varying coefficient models have been developed. However, varying coefficient models for multivariate panel count data remain to be studied. In this paper, we propose a varying coefficient mean model for multivariate panel count data to describe possible nonlinear interaction effects between the covariates, and we use a local partial log-likelihood procedure to estimate the unknown covariate effects. Furthermore, a Breslow-type estimator is constructed for the baseline mean functions. The consistency and asymptotic normality of the proposed estimators are established under some mild conditions. The utility of the proposed approach is evaluated through numerical simulations and an application to a dataset from a skin cancer study.

References

Cai J, Fan J, Zhou H, Zhou Y (2007) Hazard models with varying coefficients for multivariate failure time data. Ann Statist 35(1):324–354
Carroll RJ, Fan J, Gijbels I, Wand MP (1997) Generalized partially linear single-index models. J Am Statist Assoc 92(438):477–489
Fan J, Gijbels I, King M (1997) Local likelihood and local partial likelihood in hazard regression. Ann Statist 25(4):1661–1690
He X, Feng X, Tong X, Zhao X (2017) Semiparametric partially linear varying coefficient models with panel count data. Lifetime Data Anal 23(3):439–466
He X, Tong X, Sun J (2009) Semiparametric analysis of panel count data with correlated observation and follow-up times. Lifetime Data Anal 15(2):177
He X, Tong X, Sun J, Cook RJ (2008) Regression analysis of multivariate panel count data. Biostatistics 9(2):234–248
Hu XJ, Sun J, Wei LJ (2003) Regression parameter estimation from panel counts. Scandinavian J Statist 30(1):25–43
Li N, Park D, Sun J, Kim K (2011) Semiparametric transformation models for multivariate panel count data with dependent observation process. Can J Statist 39(3):458–474
Li Y, He X, Wang H, Zhang B, Sun J (2015) Semiparametric regression of multivariate panel count data with informative observation times. J Multivariate Anal 140:209–219
Lu M, Zhang Y (2007) Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94(3):705–718
Lu M, Zhang Y, Huang J (2009) Semiparametric estimation methods for panel count data using monotone B-splines. J Am Statist Assoc 104(487):1060–1070
Sun J, Zhao X (2013) Statistical Analysis of Panel Count Data. Springer, New York
Tibshirani R, Hastie T (1987) Local likelihood estimation. J Am Statist Assoc 82(398):559–567
Wang Y, Yu Z (2019) A kernel regression model for panel count data with time-varying coefficients. arXiv: Statistics Theory
Wang Y, Yu Z (2021) A kernel regression model for panel count data with nonparametric covariate functions. Biometrics. https://doi.org/10.1111/biom.13440
Zhang H, Zhao H, Sun J, Wang D, Kim K (2013) Regression analysis of multivariate panel count data with an informative observation process. J Multivariate Anal 119:71–80
Zhao H, Li Y, Sun J (2013a) Analyzing panel count data with a dependent observation process and a terminal event. Can J Statist 41(1):174–191
Zhao H, Li Y, Sun J (2013b) Semiparametric analysis of multivariate panel count data with dependent observation processes and a terminal event. J Nonparametr Statist 25(2):379–394
Zhao H, Tu W, Yu Z (2018) A nonparametric time-varying coefficient model for panel count data. J Nonparametr Statist 30(3):640–661
Zhao H, Virkler K, Sun J (2014) Nonparametric comparison for multivariate panel count data. Commun Statist Theory Methods 43(3):644–655
Zhao X, Tong X (2011) Semiparametric regression analysis of panel count data with informative observation times. Comput Statist Data Anal 55(1):291–300
Zhao X, Tong X, Sun J (2013) Robust estimation for panel count data with informative observation times. Comput Statist Data Anal 57(1):33–40

Acknowledgements

This paper was partially supported by the National Natural Science Foundation of China under Grant Nos. 12001485 and 12101549, the Ministry of Education Humanities and Social Sciences Research Youth Project of China under Grant No. 21YJCZH153, the National Bureau of Statistics of China under Grant No. 2020LY073 and the Characteristic & Preponderant Discipline of Key Construction Universities in Zhejiang Province (Zhejiang Gongshang University-Statistics).

Author information

Authors and Affiliations

School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, Zhejiang Province, China
Weiwei Wang & Yijun Wang
Collaborative Innovation Center of Statistical Data Engineering, Technology and Application, Zhejiang Gongshang University, Hangzhou, Zhejiang Province, China
Weiwei Wang & Yijun Wang
School of Data Sciences, Zhejiang University of Finance and Economics, Hangzhou, Zhejiang Province, China
Xiaobing Zhao

Corresponding author

Correspondence to Yijun Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix I: tables and figures

See Tables 1, 2, 3 and Figs. 1, 2, 3, 4, 5, 6, 7 and 8.

Table 1 BIAS, SSE, BSE and 95% CP for the nonparametric parts $\beta (\cdot )$, $g'(\cdot )$ and $g(\cdot )$ on pre-specified grid-points of Case I

Table 2 BIAS, SSE, BSE and 95% CP for the nonparametric parts $\beta (\cdot )$, $g'(\cdot )$ and $g(\cdot )$ on pre-specified grid-points of Case II

Table 3 BIAS, SSE, BSE and 95% CP for the nonparametric parts $\beta (\cdot )$, $g'(\cdot )$ and $g(\cdot )$ on pre-specified grid-points of Case III

Appendix II: proofs of asymptotic properties

The following lemma, which is similar to results in Fan et al. (1997) and Cai et al. (2007), is needed in the proofs of the theorems. A detailed proof can be found in Cai et al. (2007).

Lemma 1

Define

$$\begin{aligned} c_{nk}(s,v)=\frac{1}{n}\sum _{i=1}^n Y_i(s)\Psi \{Z_i,V_i,(V_i-v)/h\}K_h(V_i-v)o_{ik}(s), \end{aligned}$$

and

$$\begin{aligned} c_k(s,v)=f(v)\int E\{Y(s)\Psi (Z,V,w)o_{k}(s)|V=v\}K(w) dw, \end{aligned}$$

where $\Psi (\cdot ,\cdot ,\cdot )$ is continuous in its three arguments and $E\{\Psi (Z,V,w)|V=v\}$ is continuous at the point v. Suppose that conditions (C1) and (C6) hold, $h\rightarrow 0$ and $nh/\log n\rightarrow \infty $. Then

$$\begin{aligned} \sup _{0\le s\le \tau }\sum _{k=1}^K|c_{nk}(s,v)-c_k(s,v)|\rightarrow _p 0. \end{aligned}$$

Furthermore,

$$\begin{aligned} \sup _{0\le s\le \tau }\sup _{v\in B}\sum _{k=1}^K|c_{nk}(s,v)-c_k(s,v)|\rightarrow _p 0, \end{aligned}$$

where B is a compact set satisfying $\inf _{v\in B} f(v)>0$.
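As a rough numerical illustration of the pointwise statement in Lemma 1, the sketch below checks that a kernel-weighted average of the form $\frac{1}{n}\sum _i \Psi _i K_h(V_i-v)$ is close to $f(v)E(\Psi |V=v)$ for a small bandwidth and a large sample. It is a simplified setting chosen for illustration only: the at-risk indicator and $o_{ik}(s)$ are set to one, $\Psi$ depends on $V$ alone, the kernel is Epanechnikov, and the helper names `epanechnikov` and `kernel_average` are hypothetical, not taken from the paper.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kernel_average(values, V, v, h):
    """Compute (1/n) * sum_i values_i * K_h(V_i - v), where K_h(u) = K(u/h) / h."""
    return np.mean(values * epanechnikov((V - v) / h) / h)

# Assumed toy design: V ~ Uniform(0, 1), so f(v) = 1 on [0, 1],
# and E[psi | V = v] = sin(2 * pi * v).
rng = np.random.default_rng(0)
n = 200_000
V = rng.uniform(0.0, 1.0, size=n)
psi = np.sin(2.0 * np.pi * V) + rng.normal(size=n)

h = 0.05
v0 = 0.25
estimate = kernel_average(psi, V, v0, h)
target = 1.0 * np.sin(2.0 * np.pi * v0)  # f(v0) * E[psi | V = v0]
print(f"kernel average: {estimate:.3f}, limit: {target:.3f}")
```

Shrinking `h` while keeping `n * h` large reduces both the smoothing bias and the sampling noise, which mirrors the bandwidth conditions $h\rightarrow 0$ and $nh/\log n\rightarrow \infty $.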

Proof of Theorem 4.1

By the definition of ${{\tilde{N}}_{ik}}(t)$, the process

$$\begin{aligned} M_{ik}(t)=\int _0^t Y_i(s)\left[ d{\tilde{N}_{ik}}(s)-\exp \{\beta _0(V_i)Z_i+g(V_i)\}\mu _{0k}(s)o_{ik}(s)ds\right] ,0\le t\le \tau , \end{aligned}$$

is a $\bigcup _{i=1}^n \mathcal {F}_{t,ik}$ martingale, where $\mathcal {F}_{t,ik}=\sigma \{{{\tilde{N}}_i}(s), Z_i,V_i,Y_i(s),0\le s\le t\}$, $i=1,\cdots ,n$, $k=1,\cdots ,K$. Let $\gamma _0(v)$ denote the true value of $\gamma (v)$, and let $\zeta (v)=H\{\gamma (v)-\gamma _0(v)\}$. Then we have

$$\begin{aligned}&\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}=\\&\quad \frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^t K_h(V_i-v)Y_i(s)\left[ \gamma _0(v)^\top X_i+\zeta (v)^\top X_i^*-\log S_{kh,0}(s,\zeta ,v)\right] d{{\tilde{N}}_{ik}}(s), \end{aligned}$$

where $X_i^*=H^{-1}X_i$ and

$$\begin{aligned} S_{kh,j}(s,\zeta ,v)&=\frac{1}{n}\sum _{i=1}^n K_h(V_i-v) Y_i(s)X_i^{*\otimes j}\exp \{\gamma _0(v)^\top X_i+\zeta (v)^\top X_i^*\}o_{ik}(s),\\&\quad j=0,1,2. \end{aligned}$$

Furthermore, we define

$$\begin{aligned} S^*_{kh,j}&(s,\beta ,g,v)=\frac{1}{n}\sum _{i=1}^n K_h(V_i-v) Y_i(s)X_i^{*\otimes j}\exp \{\beta (V_i) Z_i+g(V_i)\}o_{ik}(s), \end{aligned}$$

and

$$\begin{aligned} s^*_{kh,0}&(s,\beta ,g,v)=f(v)E[Y(s)\exp \{\beta (V) Z+g(V)\}o_{k}(s)|V=v],\\ s^*_{kh,1}&(s,\beta ,g,v)=f(v)E[Y(s)\exp \{\beta (V) Z+g(V)\}(Z,Zu_1^\top ,u_1^\top )^\top o_{k}(s)|V=v],\\ s^*_{kh,2}&(s,\beta ,g,v)=f(v)E[Y(s)\exp \{\beta (V) Z+g(V)\}\Pi (Z)o_{k}(s)|V=v], \end{aligned}$$

where

$$\begin{aligned}\Pi (Z)= \begin{pmatrix} ZZ^\top &{} ZZ^\top u_1^\top &{} Zu_1^\top \\ ZZ^\top u_1 &{}ZZ^\top u_2 &{} Zu_2\\ Zu_1 &{}Z u_2 &{}u_2\\ \end{pmatrix}. \end{aligned}$$
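The empirical local moments $S_{kh,j}$, $j=0,1,2$, defined above are kernel-weighted averages of $1$, $X_i^*$ and $X_i^{*\otimes 2}$. The following is a minimal sketch of how they could be computed at a fixed time $s$ and point $v$. It assumes that the rescaled design vectors $X_i^*$, the linear predictor in the exponent, the at-risk indicators $Y_i(s)$ and the values $o_{ik}(s)$ are already available as arrays; the Epanechnikov kernel and the function name `local_moments` are illustrative choices, not taken from the paper.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_moments(X_star, lin_pred, V, v, h, at_risk, obs_weight):
    """Return (S0, S1, S2): kernel-weighted averages of 1, X*, and X* X*^T.

    X_star     : (n, p) array of rescaled design vectors X_i^* (assumed given)
    lin_pred   : (n,) array holding the exponent for each subject
    at_risk    : (n,) array of Y_i(s) at the fixed time s
    obs_weight : (n,) array of o_ik(s) at the fixed time s and event type k
    """
    X_star = np.asarray(X_star, dtype=float)
    k_h = epanechnikov((np.asarray(V) - v) / h) / h      # K_h(V_i - v)
    w = k_h * np.asarray(at_risk) * np.exp(lin_pred) * np.asarray(obs_weight)
    n = len(w)
    S0 = w.sum() / n
    S1 = w @ X_star / n
    S2 = (X_star * w[:, None]).T @ X_star / n
    return S0, S1, S2

# Tiny hand-checkable example: both subjects sit exactly at v, so K_h = 0.75 / h.
X_star = np.array([[1.0, 0.0], [0.0, 2.0]])
S0, S1, S2 = local_moments(
    X_star,
    lin_pred=np.zeros(2),
    V=np.array([0.3, 0.3]),
    v=0.3,
    h=0.5,
    at_risk=np.ones(2),
    obs_weight=np.ones(2),
)
print(S0, S1, S2, sep="\n")
```

Keeping the weights in a single vector `w` makes explicit that all three moments share the same nonnegative weighting and differ only in the power of $X_i^*$.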

Then,

$$\begin{aligned}&\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}-\ell \{\gamma _0(v),t\}\\&\quad =\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^t K_h(V_i-v)\left\{ \zeta (v)^\top X_i^*-\log \frac{S_{kh,0}(s,\zeta ,v)}{S_{kh,0}(s,0,v)}\right\} d{M_{ik}}(s)\\&\qquad +\sum _{k=1}^K\int _0^t \left\{ \zeta (v)^\top S^*_{kh,1}(s,\beta _0,g_0,v) -\log \frac{S_{kh,0}(s,\zeta ,v)}{S_{kh,0}(s,0,v)}S^*_{kh,0}(s,\beta _0,g_0,v)\right\} \mu _{0k}(s)ds\\&\quad =\sum _{k=1}^K I_{1k}\{\zeta (v),t\}+I_2\{\zeta (v),t\}. \end{aligned}$$

By Lemma 1, we can easily get that

$$\begin{aligned}&I_2\{\zeta (v),t\}\\&\quad =\sum _{k=1}^K\int _0^t \left[ \zeta (v)^\top s^*_{kh,1}(s,\beta _0,g_0,v) -\log \frac{s_{kh,0}(s,\zeta ,v)}{s_{kh,0}(s,0,v)}s^*_{kh,0}(s,\beta _0,g_0,v)\right] \\&\qquad \mu _{0k}(s)ds+o_p(1)\\&\quad =I^*_2\{\zeta (v),t\}+o_p(1). \end{aligned}$$

It can easily be shown that $I_2\{\zeta (v),t\}$ is strictly concave with respect to $\zeta (v)$ and attains its maximum at $\zeta (v)=0$. Next, note that $I_{1k}\{\zeta (v),t\}$ is a locally square-integrable martingale with predictable variation process

$$\begin{aligned} \langle I_{1k}\{\zeta (v),t\},I_{1k}\{\zeta (v),t\}\rangle =&\frac{1}{n^2}\sum _{i=1}^n\int _0^t K_h^2(V_i-v)Y_i(s)\\&\left\{ \zeta (v)^\top X_i^*-\log \frac{S_{kh,0}(s,\zeta ,v)}{S_{kh,0}(s,0,v)}\right\} ^2\\&\times \exp \{\beta _0(V_i)Z_i+g(V_i)\}\mu _{0k}(s)o_{ik}(s)ds, \end{aligned}$$

and, by Lemma 1,

$$\begin{aligned} E I_{1k}^2\{\zeta (v),t\}=E\langle I_{1k}\{\zeta (v),t\},I_{1k}\{\zeta (v),t\}\rangle =O\left( \frac{1}{nh}\right) \rightarrow 0. \end{aligned}$$

This implies that $I_{1k}\{\zeta (v),t\}\rightarrow _p 0$ for $k=1,\cdots ,K$. Hence,

$$\begin{aligned} \ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}-\ell \{\gamma _0(v),t\}=I^*_2\{\zeta (v),t\}+o_p(1). \end{aligned}$$
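The step from the second-moment bound to $I_{1k}\{\zeta (v),t\}\rightarrow _p 0$ used above can be spelled out as follows. This is only a sketch, under the assumption (presumably covered by the regularity conditions) that $\int K^2(w)dw<\infty $, that $f$ is bounded near $v$, and that the remaining factors in the variation process have bounded conditional expectation. The order $1/(nh)$ comes from the squared kernel, since a change of variables $u=v+hw$ gives

$$\begin{aligned} \int K_h^2(u-v)f(u)du=\frac{1}{h}\int K^2(w)f(v+hw)dw=O\left( \frac{1}{h}\right) , \end{aligned}$$

and the variation process is $n^{-2}$ times a sum of $n$ such terms. Chebyshev's inequality then yields, for any $\varepsilon >0$,

$$\begin{aligned} P\left[ |I_{1k}\{\zeta (v),t\}|>\varepsilon \right] \le \frac{E I_{1k}^2\{\zeta (v),t\}}{\varepsilon ^2}=O\left( \frac{1}{nh\varepsilon ^2}\right) \rightarrow 0, \end{aligned}$$

because $nh\rightarrow \infty $ is implied by $nh/\log n\rightarrow \infty $.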

Since the limit $I^*_2\{\zeta (v),t\}$ is strictly concave with respect to $\zeta (v)$ and attains its maximum at $\zeta (v)=0$, Lemma A.1 of Carroll et al. (1997) gives ${\hat{\zeta }}(v)\rightarrow _p 0$. So

$$\begin{aligned} H\{{\hat{\gamma }}(v)-\gamma _0(v)\}\rightarrow _p 0. \end{aligned}$$

Similarly, we can get

$$\begin{aligned} \sup _{v\in \Phi _v}|H\{{\hat{\gamma }}(v)-\gamma _0(v)\}|\rightarrow _p 0. \end{aligned}$$

This completes the proof of Theorem 4.1.

$\square $

Proof of Theorem 4.2

Denote

$$\begin{aligned}&\ell '\{\gamma _0(v),\tau \}=\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^\tau K_h(V_i-v)\left\{ X_i^*-\frac{S_{kh,1}(s,0,v)}{S_{kh,0}(s,0,v)}\right\} d{M_{ik}}(s)\\&\qquad +\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^\tau K_h(V_i-v)Y_i(s)\left\{ X_i^*-\frac{S_{kh,1}(s,0,v)}{S_{kh,0}(s,0,v)}\right\} \exp \{\beta _0(V_i)Z_i+g_0(V_i)\}\mu _{0k}(s)o_{ik}(s)ds\\&\quad =I_3(v,\tau )+I_4(v,\tau ). \end{aligned}$$

By a Taylor expansion and Lemma 1, we have

$$\begin{aligned} I_4(v,\tau )=&\frac{1}{(d+1)!n}\sum _{k=1}^K\sum _{i=1}^n\int _0^\tau K_h(V_i-v)Y_i(s)\left\{ X_i^*-\frac{s^*_{kh,1}(s,\beta _0,g_0,v)}{s^*_{kh,0}(s,\beta _0,g_0,v)}\right\} \\&\times \exp \{\gamma _0(v)^\top X_i^*+g_0(v)\}\{\beta ^{(d+1)}_0(v)Z_i\\&+g^{(d+1)}_0(v)\}(V_i-v)^{d+1}\mu _{0k}(s)o_{ik}(s)ds\{1+O_p(h^d)\}\\ =&\frac{h^{d+1}}{(d+1)!}b[\Gamma (v)^{-1}\beta ^{(d+1)}_0(v), 0,\cdots ,0]^\top \{1+O_p(h^d)\}=A(v,\tau ), \end{aligned}$$

where $b=\int x^{d+1}K(x)dx$. In addition,

$$\begin{aligned}&I_3(v,\tau )=\frac{1}{n}\sum _{i=1}^n\sum _{k=1}^K\int _0^\tau K_h(V_i-v)\left\{ X_i^*-\frac{s_{kh,1}(s,\zeta ,v)}{s_{kh,0}(s,\zeta ,v)}\right\} d{M_{ik}}(s)+o_p(1)\\&\quad =\frac{1}{n}\sum _{i=1}^n\sum _{k=1}^K A_{ik}(v,\tau )+o_p(1). \end{aligned}$$

It is easy to see that $\sqrt{nh}I_3(v,\tau )$ is a sum of i.i.d. random vectors $\sum _{k=1}^K A_{ik}(v,\tau )$ with zero mean and finite variance. By direct calculation, the asymptotic variance is

$$\begin{aligned} \Sigma&=\lim _{n\rightarrow \infty } Eh\left\{ \sum _{k=1}^K A_{ik}(v,\tau )\right\} ^{\otimes 2}\\&=\sum _{k_1=1}^K\sum _{k_2=1,k_1\ne k_2}^K \lim _{n\rightarrow \infty } Eh A_{1k_1}(v,\tau )A_{1k_2}(v,\tau )^\top +\sum _{k=1}^K \lim _{n\rightarrow \infty } EhA_{ik}(v,\tau )^{\otimes 2}\\&= \Sigma _1+ \Sigma _{11}. \end{aligned}$$

As $\sum _{i=1}^n A_{ik}(v,\tau )$ is a locally square-integrable martingale, it can be easily obtained that $\Sigma _{11}$ converges to $\Sigma _2$, where $\Sigma _2$ is $\textrm{diag}\{\Gamma ^{-1}(v)\nu _0,Q_2\nu _2\}$.

By Theorem 4.1, we have ${\hat{\zeta }}(v) \rightarrow 0$ in probability. Therefore, based on the mean value theorem, we can obtain that

$$\begin{aligned}&\ell ''\{\gamma _0(v)+H^{-1}{\hat{\zeta }}(v),t\}= \ell ''\{\gamma _0(v),t\}+o_p(1)\\&\quad =\frac{1}{n}\sum _{k=1}^K\sum _{i=1}^n\int _0^t K_h(V_i-v)\\&\qquad \left\{ \frac{s_{kh,2}^*(s,\beta _0,g_0,v)s_{kh,0}^*(s,\beta _0,g_0,v)-s_{kh,1}^*(s,\beta _0,g_0,v)s_{kh,1}^*(s,\beta _0,g_0,v)^\top }{s_{kh,0}^*(s,\beta _0,g_0,v)^2}\right\} \\&\qquad \times d N_{ik}(s)+o_p(1)\\&\quad =\sum _{k=1}^K\int _0^t \left\{ \frac{s_{kh,2}^*(s,\beta _0,g_0,v)s_{kh,0}^*(s,\beta _0,g_0,v)-s_{kh,1}^*(s,\beta _0,g_0,v)s_{kh,1}^*(s,\beta _0,g_0,v)^\top }{s_{kh,0}^*(s,\beta _0,g_0,v)^2}\right\} \\&\qquad \times s_{kh,0}^*(s,\beta _0,g_0,v) d s+o_p(1)\\&\quad =-B(v,\tau )+o_p(1). \end{aligned}$$
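The bracketed ratio in the display above has the structure of a weighted covariance matrix, $(S_2S_0-S_1S_1^\top )/S_0^2$, which is why the curvature is sign-definite and the local objective is concave. The sketch below checks this numerically for arbitrary nonnegative weights. The data are simulated purely for illustration, and the function name `curvature_matrix` is hypothetical.

```python
import numpy as np

def curvature_matrix(weights, X_star):
    """Return (S2 * S0 - S1 S1^T) / S0^2 for nonnegative weights.

    S0, S1 and S2 are the weighted sums of 1, x and x x^T. The common
    1/n normalisation cancels in the ratio, so plain sums are used.
    """
    w = np.asarray(weights, dtype=float)
    X = np.asarray(X_star, dtype=float)
    S0 = w.sum()
    S1 = w @ X
    S2 = (X * w[:, None]).T @ X
    return (S2 * S0 - np.outer(S1, S1)) / S0**2

rng = np.random.default_rng(1)
X_star = rng.normal(size=(50, 3))          # simulated design vectors
weights = rng.uniform(0.1, 2.0, size=50)   # simulated positive weights

C = curvature_matrix(weights, X_star)
eigenvalues = np.linalg.eigvalsh(C)
print("smallest eigenvalue:", eigenvalues.min())
```

The result coincides with the weighted covariance returned by `np.cov(X_star.T, aweights=weights, bias=True)`, so all eigenvalues are nonnegative, and strictly positive when the weighted design is not degenerate.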

As ${\hat{\zeta }}$ is the maximizer of the function $\ell \{\gamma _0(v)+H^{-1}\zeta (v),t\}$, we have

$$\begin{aligned} -\ell '\{\gamma _0(v),t\}=\ell '\{\gamma _0(v)+H^{-1}{\hat{\zeta }}(v),t\}-\ell '\{\gamma _0(v),t\}=\ell ''\{\gamma _0(v)+H^{-1}{\hat{\zeta }}^{*}(v),t\}^\top {\hat{\zeta }}, \end{aligned}$$

where ${\hat{\zeta }}^{*}(v)$ lies between 0 and ${\hat{\zeta }}(v)$ (the second equality follows from a Taylor expansion of $\ell '\{\gamma _0(v)+H^{-1}\zeta (v),t\}$ around 0). Hence,

$$\begin{aligned}&{\hat{\zeta }}(v)-B(v,\tau )^{-1}A(v,\tau )=\\&\quad -\ell ''\{\gamma _0(v)+H^{-1}{\hat{\zeta }}^*(v),t\}^{-1} [\ell '\{\gamma _0(v),t\}-A(v,\tau )]+o_p(1). \end{aligned}$$

By Slutsky’s theorem, we have

$$\begin{aligned} \sqrt{nh} \{{\hat{\zeta }}(v)-B(v,\tau )^{-1}A(v,\tau )\}\rightarrow _d N\{0, B(v,\tau )^{-1}\Sigma B(v,\tau )^{-1}\}. \end{aligned}$$

This completes the proof of Theorem 4.2. $\square $
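The limiting distribution in Theorem 4.2 has a sandwich covariance $B^{-1}\Sigma B^{-1}$ and a centering term $B^{-1}A$. Given plug-in estimates of these quantities, pointwise Wald-type intervals could be formed as in the sketch below. It assumes that estimates of $B(v,\tau )$, $\Sigma $ and the bias term are already available; how they are obtained is not covered here. The function name `sandwich_interval` and the numbers in the example are hypothetical.

```python
import numpy as np

def sandwich_interval(zeta_hat, bias, B, Sigma, n, h, z=1.96):
    """Pointwise intervals based on sqrt(n h) (zeta_hat - bias) ~ N(0, B^-1 Sigma B^-1).

    zeta_hat : (p,) estimate
    bias     : (p,) plug-in value of B^-1 A (assumed supplied by the user)
    B, Sigma : (p, p) plug-in matrices (assumed supplied by the user)
    """
    B_inv = np.linalg.inv(B)
    cov = B_inv @ Sigma @ B_inv.T / (n * h)   # finite-sample covariance approximation
    se = np.sqrt(np.diag(cov))
    center = np.asarray(zeta_hat, dtype=float) - np.asarray(bias, dtype=float)
    return center - z * se, center + z * se

# Made-up inputs, chosen so the result is easy to verify by hand.
B = np.diag([2.0, 4.0])
Sigma = np.diag([4.0, 16.0])
lower, upper = sandwich_interval(
    zeta_hat=np.array([0.1, -0.2]),
    bias=np.zeros(2),
    B=B,
    Sigma=Sigma,
    n=100,
    h=0.25,
)
print(lower, upper)
```

Here $nh=25$ and $B^{-1}\Sigma B^{-1}$ is the identity, so each standard error is $0.2$ and each interval has half-width $1.96\times 0.2=0.392$.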

About this article

Cite this article

Wang, W., Wang, Y. & Zhao, X. Semiparametric analysis of multivariate panel count data with nonlinear interactions. Lifetime Data Anal 28, 89–115 (2022). https://doi.org/10.1007/s10985-021-09537-1

Received: 05 September 2020
Accepted: 22 September 2021
Published: 05 October 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10985-021-09537-1
