Bagging Soft Decision Trees

  • Olcay Taner Yıldız
  • Ozan İrsoy
  • Ethem Alpaydın
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9605)


The decision tree is one of the earliest predictive models in machine learning. In the soft decision tree, which is based on the hierarchical mixture of experts model, internal binary nodes make soft decisions: each node sends the input to both children, with probabilities given by a sigmoid gating function. Hence, for a given input, every path to every leaf is traversed, and all leaves contribute to the final decision, each with a different probability determined by the gating values along its path. Tree induction is incremental: the tree grows when needed by replacing leaves with subtrees, and the parameters of the newly added nodes are learned using gradient descent. We have previously shown that such soft trees generalize better than hard trees; here, we propose to bag soft decision trees for higher accuracy. On 27 two-class classification data sets (ten of which are from the medical domain) and 26 regression data sets, we show that bagged soft trees generalize better than both single soft trees and bagged hard trees. This contribution falls within the scope of research track 2 listed in the editorial, namely, machine learning algorithms.
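The soft gating and the bagging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Leaf`, `SoftNode`, `fit_stump`, `bag`, and `bagged_predict` are hypothetical, and `fit_stump` is a simplified stand-in that trains a single gating node with two constant leaves by stochastic gradient descent on squared error (regression only), rather than the incremental tree induction used in the paper.

```python
import math
import random


def sigmoid(a):
    # Numerically safe logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


class Leaf:
    """Constant response at a leaf."""

    def __init__(self, value):
        self.value = value

    def predict(self, x):
        return self.value


class SoftNode:
    """Internal node: both children are visited, weighted by the gate."""

    def __init__(self, w, w0, left, right):
        self.w, self.w0, self.left, self.right = w, w0, left, right

    def gate(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.w0)

    def predict(self, x):
        g = self.gate(x)
        return g * self.left.predict(x) + (1.0 - g) * self.right.predict(x)


def fit_stump(X, y, epochs=300, lr=0.1, rng=None):
    # Assumption: one soft node with two constant leaves, squared error, SGD.
    rng = rng or random.Random(0)
    w = [rng.uniform(-0.1, 0.1) for _ in X[0]]
    w0 = 0.0
    zl, zr = rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w0)
            err = g * zl + (1.0 - g) * zr - t
            # Gradient of 0.5 * err**2 with respect to the gate's pre-activation.
            da = err * (zl - zr) * g * (1.0 - g)
            zl -= lr * err * g
            zr -= lr * err * (1.0 - g)
            w = [wi - lr * da * xi for wi, xi in zip(w, x)]
            w0 -= lr * da
    return SoftNode(w, w0, Leaf(zl), Leaf(zr))


def bag(X, y, n_trees=10, seed=0):
    # Train each tree on a bootstrap sample (drawn with replacement).
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], rng=rng))
    return trees


def bagged_predict(trees, x):
    # Bagging for regression: average the individual tree responses.
    return sum(tree.predict(x) for tree in trees) / len(trees)
```

For example, `SoftNode([0.0], 0.0, Leaf(1.0), Leaf(3.0)).predict([5.0])` evaluates the gate at 0.5 and returns the equal-weight mix of both leaves. For two-class classification, one plausible variant would average the trees' predicted probabilities or take a majority vote; the abstract does not say which combination rule is used.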


Decision trees, Regression trees, Regularization, Bagging



This work is partially supported by Boğaziçi University Research Funds with Grant Number 14A01P4.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Olcay Taner Yıldız (1)
  • Ozan İrsoy (2)
  • Ethem Alpaydın (3)

  1. Department of Computer Engineering, Işık University, İstanbul, Turkey
  2. Department of Computer Science, Cornell University, Ithaca, USA
  3. Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey
