Rationale and Applications of Survival Tree and Survival Ensemble Methods

Psychometrika

Abstract

Classification and Regression Trees (CART) and their successors, bagging and random forests, are statistical learning tools that are receiving increasing attention. However, because of the way censored data are collected, standard CART algorithms are not immediately transferable to the context of survival analysis. Questions about the occurrence and timing of events arise throughout the psychological and behavioral sciences, especially in longitudinal studies. The predictive power and other key features of tree-based methods are promising in studies where event occurrence is the outcome of interest. This article reviews existing tree algorithms designed specifically for censored responses, as well as recently developed survival ensemble methods, and introduces available computer software. The merits and limitations of these methods are discussed through simulations and a practical example, and suggestions are provided for practical use.

Acknowledgments

This study was supported by National Science Foundation SES-1124283. We thank David Elashoff (UCLA) for his comments on an earlier draft of this work.

Author information

Corresponding author

Correspondence to Yan Zhou.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (docx 15 KB)

About this article

Cite this article

Zhou, Y., McArdle, J.J. Rationale and Applications of Survival Tree and Survival Ensemble Methods. Psychometrika 80, 811–833 (2015). https://doi.org/10.1007/s11336-014-9413-1

  • DOI: https://doi.org/10.1007/s11336-014-9413-1
