Abstract
Variable selection is an important issue in all regression analysis. In this paper, we discuss it in the context of regression analysis of recurrent event data. Such data often arise in long-term studies in which individuals may experience the event of interest more than once, and their analysis has recently attracted a great deal of attention (Andersen et al., Statistical models based on counting processes, 1993; Cook and Lawless, Biometrics 52:1311–1323, 1996, The analysis of recurrent event data, 2007; Cook et al., Biometrics 52:557–571, 1996; Lawless and Nadeau, Technometrics 37:158–168, 1995; Lin et al., J R Stat Soc B 62:711–730, 2000). However, there appear to be no established approaches to variable selection for recurrent event data. To address this problem, we adopt the idea behind the nonconcave penalized likelihood approach proposed in Fan and Li (J Am Stat Assoc 96:1348–1360, 2001) and develop a nonconcave penalized estimating function approach. The proposed approach selects variables and estimates regression coefficients simultaneously, and an algorithm is presented for this process. We show that the proposed approach performs as well as the oracle procedure, in that it yields estimates as if the correct submodel were known. Simulation studies conducted to assess the performance of the proposed approach suggest that it works well in practical situations. The methodology is illustrated using data from a chronic granulomatous disease study.
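As a rough illustration of what a nonconcave penalty looks like, the sketch below implements the SCAD penalty introduced in Fan and Li (2001) and its derivative. The abstract does not say which penalty the paper uses, so treating SCAD as the penalty here is an assumption; the function names `scad_penalty` and `scad_derivative` and the default `a = 3.7` (the value suggested by Fan and Li) are illustrative choices, not taken from the paper.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|) of Fan and Li (2001); assumes lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the Lasso.
        return lam * t
    if t <= a * lam:
        # Quadratic piece that joins the linear and constant parts smoothly.
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Large coefficients receive a constant penalty.
    return (a + 1.0) * lam * lam / 2.0


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam
    # Decays linearly to zero at |theta| = a * lam and stays zero beyond it.
    return max(a * lam - t, 0.0) / (a - 1.0)


if __name__ == "__main__":
    for value in (0.5, 2.0, 5.0):
        print(value, scad_penalty(value, 1.0), scad_derivative(value, 1.0))
```

Because the derivative equals `lam` near zero, small coefficients are pushed to zero, while the vanishing derivative beyond `a * lam` leaves large coefficients essentially unpenalized. This is the behaviour usually associated with the oracle property mentioned above.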
References
Akaike H (1973) Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60: 255–265
Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10: 1100–1120
Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, New York
Antoniadis A (1997) Wavelets in statistics: a review (with discussion). J Italian Stat Soc 6: 97–144
Breiman L (1996) Heuristics of instability and stabilization in model selection. Ann Stat 24: 2350–2383
Cai J, Schaubel DE (2004) Marginal means/rates models for multiple type recurrent event data. Lifetime Data Anal 10: 121–138
Cai J, Fan J, Li R, Zhou H (2005) Variable selection for multivariate failure time data. Biometrika 92: 303–316
Cook RJ, Lawless JF (1996) Interim monitoring of longitudinal comparative studies with recurrent event responses. Biometrics 52: 1311–1323
Cook RJ, Lawless JF (2007) The analysis of recurrent event data. Springer, New York
Cook RJ, Lawless JF, Nadeau JC (1996) Robust tests for treatment comparisons based on recurrent event responses. Biometrics 52: 557–571
Fan J (1997) Comment on ‘Wavelets in statistics: a review’ by A. Antoniadis. J Italian Stat Soc 6: 131–138
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96: 1348–1360
Fan J, Li R (2002) Variable selection for Cox’s proportional hazards model and frailty model. Ann Stat 30: 74–99
Fan J, Peng H (2004) Nonconcave penalized likelihood with a diverging number of parameters. Ann Stat 32: 928–961
Fleming TR, Harrington DP (1991) Counting processes and survival analysis. Wiley, New York
Knight K, Fu W (2000) Asymptotics for Lasso-type estimators. Ann Stat 28: 1356–1378
Lawless JF, Nadeau C (1995) Some simple robust methods for the analysis of recurrent events. Technometrics 37: 158–168
Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36: 261–286
Lin DY, Wei LJ, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc B 62: 711–730
Lin DY, Wei LJ, Ying Z (2001) Semiparametric transformation models for point processes. J Am Stat Assoc 96: 620–628
Mallows CL (1973) Some comments on Cp. Technometrics 15: 661–675
Prentice RL, Williams BJ, Peterson AV (1981) On the regression analysis of multivariate failure time data. Biometrika 68: 373–379
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6: 461–464
Tibshirani RJ (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc B 58: 267–288
Tibshirani RJ (1997) The Lasso method for variable selection in the Cox model. Stat Med 16: 385–395
Wei LJ, Lin DY, Weissfeld L (1989) Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84: 1065–1073
About this article
Cite this article
Tong, X., Zhu, L. & Sun, J. Variable selection for recurrent event data via nonconcave penalized estimating function. Lifetime Data Anal 15, 197–215 (2009). https://doi.org/10.1007/s10985-008-9104-2
DOI: https://doi.org/10.1007/s10985-008-9104-2