Tree-Based Methods

High-Dimensional Data Analysis in Cancer Research

Correspondence to Adele Cutler.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Cutler, A., Cutler, D.R., Stevens, J.R. (2009). Tree-Based Methods. In: Li, X., Xu, R. (eds) High-Dimensional Data Analysis in Cancer Research. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-0-387-69765-9_5
