Abstract
Modern data collection techniques have produced a growing number of large clustered time-to-event data sets, in which patients are observed across a large number of healthcare providers. Semiparametric frailty models are a flexible and powerful tool for modeling clustered time-to-event data. In this manuscript, we first provide a computationally efficient approach, based on a minorization-maximization (MM) algorithm, to fit semiparametric frailty models in large-scale settings. We then extend the proposed method to incorporate complex data structures such as time-varying effects, for which many existing methods fail because their computational demands become prohibitive. The finite-sample properties and the utility of the proposed method are examined through an extensive simulation study and an analysis of national kidney transplant data.
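For readers who want the setup in symbols, the following is a minimal sketch of a shared frailty model with a time-varying effect, together with the ascent property that MM algorithms rely on. The notation ($\lambda_0$, $\beta(t)$, $\gamma_i$, $\ell$, $g$) is assumed here for illustration and is not taken from the chapter; the chapter's exact model and surrogate function may differ.

```latex
% Illustrative notation only (assumed, not the chapter's):
% patient j in provider (cluster) i, covariates X_{ij}, provider-level frailty \gamma_i.
\begin{align}
  % Hazard with baseline \lambda_0(t) and time-varying coefficient \beta(t)
  \lambda_{ij}(t \mid X_{ij}, \gamma_i)
    &= \lambda_0(t)\,\exp\{X_{ij}^{\top}\beta(t) + \gamma_i\}, \\
  % Surrogate g minorizes the objective \ell and touches it at the current iterate
  g(\theta \mid \theta^{(k)}) &\le \ell(\theta) \quad \text{for all } \theta,
  \qquad g(\theta^{(k)} \mid \theta^{(k)}) = \ell(\theta^{(k)}), \\
  % MM update: maximize the surrogate instead of the objective
  \theta^{(k+1)} &= \arg\max_{\theta}\, g(\theta \mid \theta^{(k)}), \\
  % Ascent property: the objective never decreases across iterations
  \ell(\theta^{(k+1)}) &\ge g(\theta^{(k+1)} \mid \theta^{(k)})
    \ge g(\theta^{(k)} \mid \theta^{(k)}) = \ell(\theta^{(k)}).
\end{align}
```

Setting $\beta(t) \equiv \beta$ recovers the usual proportional hazards frailty model. The last line shows why each MM update cannot decrease the objective, provided the surrogate is chosen so that it is cheap to maximize.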
Acknowledgements
This work was supported in part by Health Resources and Services Administration contract 234-2005-37011C. The content is the responsibility of the authors alone and does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. Yi Li’s research is partly supported by the Chinese Natural Science Foundation (11528102).
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
He, K., Li, Y., Wei, Q., Li, Y. (2017). A Computationally Efficient Approach for Modeling Complex and Big Survival Data. In: Ahmed, S. (ed.) Big and Complex Data Analysis. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41573-4_10
DOI: https://doi.org/10.1007/978-3-319-41573-4_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41572-7
Online ISBN: 978-3-319-41573-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)