Polynomial spline estimation of panel count data model with an unknown link function

  • Regular Article
  • Statistical Papers

Abstract

Panel count data are frequently encountered in follow-up studies such as clinical trials, reliability research, and insurance studies. Models for this type of data usually assume that the covariates enter the log conditional mean function linearly. However, the linearity assumption cannot always be guaranteed in practice, especially when high-dimensional covariates are under investigation. In this paper, we propose a more flexible conditional mean regression model for panel count data with an unknown link function to capture possible nonlinearity in the covariate effects. A partial likelihood procedure is developed to estimate the unknown link function and the regression parameters simultaneously: the unknown link function is first approximated by polynomial splines, and a two-step iterative algorithm is then used for computation. A Breslow-type estimator is constructed for the baseline mean function. Asymptotic results for the proposed estimators are discussed under some regularity conditions. In addition, a penalized spline estimation procedure is introduced as an extension. Extensive numerical studies indicate that the proposed procedure works well. Finally, two applications, to a bladder cancer study and a skin cancer study, are presented for illustration.
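The spline approximation step mentioned above can be illustrated with a minimal sketch. It assumes that the link is written as \(\psi (u)\approx \gamma ^\top B(u)\), with \(B\) a vector of B-spline basis functions and \(\gamma \) the spline coefficients (the notation used in Appendix II). The function names `bspline_basis` and `spline_link`, the knot vector, and the rescaling of the index \(\beta ^\top Z\) to \([0,1]\) are illustrative assumptions, not the authors' implementation.

```python
def bspline_basis(u, knots, degree):
    """Evaluate all B-spline basis functions of a given degree at u
    using the Cox-de Boor recursion. `knots` must be non-decreasing."""
    n_int = len(knots) - 1
    # Degree 0: indicator of the half-open knot interval containing u.
    b = [1.0 if knots[i] <= u < knots[i + 1] else 0.0 for i in range(n_int)]
    if u == knots[-1]:
        # Assign the right endpoint to the last non-empty interval.
        for i in range(n_int - 1, -1, -1):
            if knots[i] < knots[i + 1]:
                b[i] = 1.0
                break
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        nb = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (u - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - u) / right_den * b[i + 1]
                     if right_den > 0 else 0.0)
            nb.append(left + right)
        b = nb
    return b


def spline_link(u, gamma, knots, degree):
    """Spline approximation of the link: psi(u) = gamma^T B(u)."""
    basis = bspline_basis(u, knots, degree)
    if len(gamma) != len(basis):
        raise ValueError("gamma must have one coefficient per basis function")
    return sum(g * v for g, v in zip(gamma, basis))


# Hypothetical setup: cubic splines on [0, 1] with one interior knot at 0.5.
knots = [0.0] * 4 + [0.5] + [1.0] * 4
gamma = [0.0, 0.4, -0.2, 0.3, 1.0]   # illustrative coefficients only
psi_at_index = spline_link(0.3, gamma, knots, 3)
```

With the clamped knot vector used here the basis functions sum to one on \([0,1]\), so a constant \(\gamma \) gives a constant link; richer shapes come from letting the coefficients vary.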

Acknowledgements

This paper was partially supported by the National Natural Science Foundation of China under Grant Nos. 12101549 and 12001485, the Ministry of Education Humanities and Social Sciences Research Youth Project of China under Grant No. 21YJCZH153, the Characteristic & Preponderant Discipline of Key Construction Universities in Zhejiang Province (Zhejiang Gongshang University-Statistics), and the Fundamental Research Funds for the Provincial Universities of Zhejiang.

Author information

Corresponding author

Correspondence to Weiwei Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix I: Tables and figures

See Tables 1–11 and Figs. 1–6.

Table 1 Summary statistics for angles of \({\hat{\beta }}\) and \(\beta \) based on known link function, unknown link function and identity link function of Case I
Table 2 Results of the coefficients \(\beta \) based on known link function, unknown link function and identity link function of Case I
Table 3 Summary statistics for angles of the \({\hat{\beta }}\) and \(\beta \) based on known link function, unknown link function and identity link function of Case II
Table 4 Results of the coefficients \(\beta \) based on known link function, unknown link function and identity link function of Case II
Table 5 Summary statistics for angles of the \({\hat{\beta }}\) and \(\beta \) based on known link function, unknown link function and identity link function of Case III
Table 6 Results of the coefficients \(\beta \) based on unknown link function and identity link function of Case III
Table 7 Comparisons between polynomial spline estimation, penalized spline estimation and kernel estimation of Case IV
Table 8 Variable selection results of \(\beta \) for Case V
Table 9 Sparse estimation results of the nonzero coefficients of \(\beta \) of Case V
Table 10 Estimated values (Est), standard errors (SE) and \(95\%\) confidence intervals (CI) of \(\beta \) for the bladder tumor data
Table 11 Estimated values (Est), standard errors (SE) and \(95\%\) confidence intervals (CI) of \(\beta \) for the skin cancer data
Fig. 1 Estimated curves for baseline function \(\mu _0\) and link function \(\psi \) of Case I
Fig. 2 Estimated curves for baseline function \(\mu _0\) and link function \(\psi \) of Case II
Fig. 3 Estimated curves for baseline function \(\mu _0\) and link function \(\psi \) of Case III
Fig. 4 Boxplots of RASEs for \(\mu _0\) and \(\psi \) of Case I–III
Fig. 5 Estimated curve of the link function \(\psi \) for the bladder cancer data
Fig. 6 Estimated curves of the link function \(\psi \) for the skin cancer data

Appendix II: Formulas for \(H(\alpha ,\gamma )\) and \(\mathrm{var}({\hat{\beta }},{\hat{\gamma }})\)

The Hessian matrix of \(\alpha \) and \(\gamma \) is defined as

$$\begin{aligned} H(\alpha ,\gamma )=\left( \begin{array}{cc} H_{\alpha ,\alpha } & H_{\alpha ,\gamma } \\ H_{\alpha ,\gamma }^\top & H_{\gamma ,\gamma } \end{array}\right) . \end{aligned}$$

Let \({\tilde{Z}}_i=(Z_{i2},\ldots ,Z_{ip})^\top \) and \(\xi _i=-Z_{i1}\alpha /(1-\Vert \alpha \Vert _2^2)^{1/2}+{\tilde{Z}}_i\), \(i=1,\ldots ,n\). Let \(A=(a_{ij})\) be the \((p-1)\times (p-1)\) matrix with entries \(a_{ii}=1+\alpha _i^2/(1-\Vert \alpha \Vert _2^2)\) and \(a_{ij}=\alpha _i\alpha _j/(1-\Vert \alpha \Vert _2^2)\) for \(i\ne j\). Then

$$\begin{aligned}H_{\alpha ,\alpha }=H_9+H_{10}+H_{11}+H_{12}+H_{13},\end{aligned}$$

where

$$\begin{aligned} H_{9}&=n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \gamma ^\top B''(\beta ^\top Z_i)\xi _i\xi _i^\top +\gamma ^\top B'(\beta ^\top Z_i)\frac{-Z_{i1}}{\sqrt{1-\Vert \alpha \Vert _2^2}}A\right\} d{\tilde{N}}_i(s),\\
H_{10}&=-n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \sum _{j=1}^n w_{j}(s,\beta ,\gamma )\gamma ^\top B''(\beta ^\top Z_j)\xi _j\xi _j^\top \right\} d{\tilde{N}}_i(s),\\
H_{11}&=-n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \sum _{j=1}^n w_{j}(s,\beta ,\gamma )\gamma ^\top B'(\beta ^\top Z_j)\frac{-Z_{j1}}{\sqrt{1-\Vert \alpha \Vert _2^2}}A\right\} d{\tilde{N}}_i(s),\\
H_{12}&=-n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \sum _{j=1}^n w_{j}(s,\beta ,\gamma )\{\gamma ^\top B'(\beta ^\top Z_j)\}^2\xi _j\xi _j^\top \right\} d{\tilde{N}}_i(s), \end{aligned}$$

and

$$\begin{aligned} H_{13}=n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \left( \sum _{j=1}^n w_{j}(s,\beta ,\gamma )\gamma ^\top B'(\beta ^\top Z_j)\xi _j\right) \left( \sum _{k=1}^n w_{k}(s,\beta ,\gamma )\gamma ^\top B'(\beta ^\top Z_k)\xi _k^\top \right) \right\} d{\tilde{N}}_i(s). \end{aligned}$$
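
The quantities \(\xi _i\) and \(A\) entering \(H_{\alpha ,\alpha }\) can be traced back to two derivatives of the index \(\beta ^\top Z_i\). The short derivation below is a sketch that assumes only the parametrization \(\beta _1=(1-\Vert \alpha \Vert _2^2)^{1/2}\), \((\beta _2,\ldots ,\beta _p)=\alpha ^\top \) stated later in this appendix; \(I_{p-1}\) denotes the identity matrix.

$$\begin{aligned} \beta ^\top Z_i&=(1-\Vert \alpha \Vert _2^2)^{1/2}Z_{i1}+\alpha ^\top {\tilde{Z}}_i,\\ \frac{\partial (\beta ^\top Z_i)}{\partial \alpha }&=-\frac{Z_{i1}\alpha }{(1-\Vert \alpha \Vert _2^2)^{1/2}}+{\tilde{Z}}_i=\xi _i,\\ \frac{\partial ^2(\beta ^\top Z_i)}{\partial \alpha \,\partial \alpha ^\top }&=-\frac{Z_{i1}}{(1-\Vert \alpha \Vert _2^2)^{1/2}}\left( I_{p-1}+\frac{\alpha \alpha ^\top }{1-\Vert \alpha \Vert _2^2}\right) =\frac{-Z_{i1}}{\sqrt{1-\Vert \alpha \Vert _2^2}}A. \end{aligned}$$

Applying the chain rule to \(\gamma ^\top B(\beta ^\top Z_i)\) then gives \(\gamma ^\top B''(\beta ^\top Z_i)\xi _i\xi _i^\top +\gamma ^\top B'(\beta ^\top Z_i)\{-Z_{i1}/(1-\Vert \alpha \Vert _2^2)^{1/2}\}A\), which is the integrand of \(H_9\).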

The matrix \(H_{\alpha ,\gamma }\) is given as \(H_{\alpha ,\gamma }=H_{14}+H_{15}+H_{16}\), where

$$\begin{aligned} H_{14}&=n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \xi _i B'^\top (\beta ^\top Z_i)\right\} d{\tilde{N}}_i(s), \\ H_{15}&=-n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \sum _{j=1}^n w_{j}(s,\beta ,\gamma ) \left[ \xi _j B'^\top (\beta ^\top Z_j)+\gamma ^\top B'(\beta ^\top Z_j)\xi _jB^\top (\beta ^\top Z_j)\right] \right\} d{\tilde{N}}_i(s), \end{aligned}$$

and

$$\begin{aligned} H_{16}=n^{-1}\sum _{i=1}^n\int _0^\tau Y_i(s)\left\{ \left( \sum _{j=1}^n w_{j}(s,\beta ,\gamma ) \gamma ^\top B'(\beta ^\top Z_j)\xi _j\right) \left( \sum _{k=1}^n w_{k}(s,\beta ,\gamma ) B^\top (\beta ^\top Z_k)\right) \right\} d{\tilde{N}}_i(s). \end{aligned}$$

The matrix \(H_{\gamma ,\gamma }\) is the same as the one given in Subsection 3.1. From the transformation \(\beta =((1-\Vert \alpha \Vert _2^2)^{1/2}, \alpha _1,\ldots ,\alpha _{p-1})^\top \), we have \(\partial \beta _1/\partial \alpha _k=-\alpha _k/(1-\Vert \alpha \Vert _2^2)^{1/2}=-\beta _{k+1}/\beta _1\) for \(k=1,\ldots ,p-1\), so that, evaluated at the estimates,

$$\begin{aligned} \frac{\partial (\beta ,\gamma )}{\partial (\alpha ,\gamma )}=\left( \begin{array}{l} -\frac{{\hat{\beta }}_2}{{\hat{\beta }}_1},\ldots , -\frac{{\hat{\beta }}_p}{{\hat{\beta }}_1},\quad 0_{1\times L} \\ \quad \quad I_{p-1+L}\\ \end{array}\right) . \end{aligned}$$

Therefore, by the delta method,

$$\begin{aligned} \mathrm{var}({\hat{\beta }},{\hat{\gamma }})=\left( \begin{array}{l} -\frac{{\hat{\beta }}_2}{{\hat{\beta }}_1},\ldots , -\frac{{\hat{\beta }}_p}{{\hat{\beta }}_1}, \quad 0_{1\times L} \\ \quad \quad I_{p-1+L}\\ \end{array}\right) \{-H({\hat{\alpha }},{\hat{\gamma }})\}^{-1} \left( \begin{array}{l} -\frac{{\hat{\beta }}_2}{{\hat{\beta }}_1},\ldots , -\frac{{\hat{\beta }}_p}{{\hat{\beta }}_1},\quad 0_{1\times L} \\ \quad \quad I_{p-1+L}\\ \end{array}\right) ^\top . \end{aligned}$$
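
The delta-method step can be sketched numerically as follows. Differentiating \(\beta _1=(1-\Vert \alpha \Vert _2^2)^{1/2}\) gives \(\partial \beta _1/\partial \alpha _k=-\alpha _k/\beta _1=-\beta _{k+1}/\beta _1\), and the sketch uses that sign in the first row of the Jacobian. The function names, the pure-Python matrix products, and the input `cov_alpha_gamma` (standing in for \(\{-H({\hat{\alpha }},{\hat{\gamma }})\}^{-1}\), which is not computed here) are illustrative assumptions rather than the authors' code.

```python
import math


def beta_from_alpha(alpha):
    """Map alpha (length p-1, with ||alpha||_2 < 1) to the unit vector beta."""
    s = sum(a * a for a in alpha)
    if s >= 1.0:
        raise ValueError("need ||alpha||_2 < 1")
    return [math.sqrt(1.0 - s)] + list(alpha)


def jacobian(alpha, n_gamma):
    """(p + L) x (p - 1 + L) Jacobian of (beta, gamma) w.r.t. (alpha, gamma).
    First row: d beta_1 / d alpha_k = -beta_{k+1} / beta_1, then zeros for gamma.
    Remaining rows: identity, since beta_{k+1} = alpha_k and gamma is unchanged."""
    beta = beta_from_alpha(alpha)
    m = len(alpha) + n_gamma
    first_row = [-b / beta[0] for b in beta[1:]] + [0.0] * n_gamma
    identity = [[1.0 if r == c else 0.0 for c in range(m)] for r in range(m)]
    return [first_row] + identity


def delta_method_cov(jac, cov_alpha_gamma):
    """Return J V J^T, where V is the covariance of (alpha_hat, gamma_hat)."""
    m = len(cov_alpha_gamma)
    jv = [[sum(row[k] * cov_alpha_gamma[k][c] for k in range(m))
           for c in range(m)] for row in jac]
    return [[sum(jv[r][k] * jac[c][k] for k in range(m))
             for c in range(len(jac))] for r in range(len(jac))]


# Hypothetical inputs: p = 3 covariates, L = 1 spline coefficient.
alpha_hat = [0.6, 0.0]
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # placeholder for {-H}^{-1}
cov_beta_gamma = delta_method_cov(jacobian(alpha_hat, 1), V)
```

The sign of the first Jacobian row does not affect the variance of \({\hat{\beta }}_1\) itself, but it does determine the sign of its covariances with the remaining components.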

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Wang, W. & Zhao, X. Polynomial spline estimation of panel count data model with an unknown link function. Stat Papers 64, 1805–1832 (2023). https://doi.org/10.1007/s00362-022-01364-2

  • DOI: https://doi.org/10.1007/s00362-022-01364-2
