A Computationally Efficient Approach for Modeling Complex and Big Survival Data

Big and Complex Data Analysis

Part of the book series: Contributions to Statistics (CONTRIB.STAT.)

Abstract

Modern data collection techniques have resulted in an increasing number of big clustered time-to-event data sets, wherein patients are often observed from a large number of healthcare providers. Semiparametric frailty models are a flexible and powerful tool for modeling clustered time-to-event data. In this manuscript, we first provide a computationally efficient approach based on a minorization–maximization (MM) algorithm to fit semiparametric frailty models in large-scale settings. We then extend the proposed method to incorporate complex data structures such as time-varying effects, for which many existing methods fail because of a lack of computational power. The finite-sample properties and the utility of the proposed method are examined through an extensive simulation study and an analysis of the national kidney transplant data.

Acknowledgements

This work was supported in part by Health Resources and Services Administration contract 234-2005-37011C. The content is the responsibility of the authors alone and does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. Yi Li’s research is partly supported by the Chinese Natural Science Foundation (11528102).

Corresponding author

Correspondence to Yi Li.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

He, K., Li, Y., Wei, Q., Li, Y. (2017). A Computationally Efficient Approach for Modeling Complex and Big Survival Data. In: Ahmed, S. (eds) Big and Complex Data Analysis. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41573-4_10
