Decision Tree and Ensemble Learning Algorithms with Their Applications in Bioinformatics

  • Dongsheng Che
  • Qi Liu
  • Khaled Rasheed
  • Xiuping Tao
Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 696)


Machine learning approaches have wide applications in bioinformatics, and the decision tree is one of the successful approaches applied in this field. In this chapter, we briefly review decision tree and related ensemble algorithms and show successful applications of these approaches to solving biological problems. We hope that by learning the algorithms behind decision trees and ensemble classifiers, biologists can grasp the basic ideas of how machine learning algorithms work. Conversely, by being exposed to the applications of decision trees and ensemble algorithms in bioinformatics, computer scientists can get a better idea of which bioinformatics topics they might pursue in future research. We aim to provide a platform that bridges the gap between biologists and computer scientists.


Keywords: Decision Tree; Random Forest; Base Classifier; Ensemble Classifier; Decision Tree Algorithm



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Dongsheng Che
    • 1
  • Qi Liu
  • Khaled Rasheed
  • Xiuping Tao
  1. Department of Computer Science, East Stroudsburg University, East Stroudsburg, USA
