Semi-random Model Tree Ensembles: An Effective and Scalable Regression Method

  • Bernhard Pfahringer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7106)


We present and investigate ensembles of semi-random model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivalling the state of the art in numeric prediction. An empirical investigation shows that semi-random model trees achieve predictive performance competitive with state-of-the-art methods such as Gaussian Processes Regression or Additive Groves of Regression Trees. Training and optimization of semi-random model trees scale better to larger datasets than Gaussian Processes Regression, and hold a constant-factor advantage of one to two orders of magnitude over Additive Groves.
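To make the general idea concrete, the following is a minimal Python sketch of an averaged ensemble of randomized model trees. It is not the paper's algorithm: the abstract gives no algorithmic detail, so every design choice here is an assumption made for illustration. Specifically, the split attribute is drawn at random, the split point is the median of that attribute, each leaf holds a ridge-penalised linear model on a single randomly chosen attribute, and the ensemble prediction is the plain average over trees. All function and parameter names (`build_tree`, `fit_ensemble`, `min_leaf`, `lam`) are hypothetical.

```python
import random
from statistics import fmean, median


def fit_leaf(X, y, rng, lam):
    """Leaf model (assumed): ridge-penalised linear fit on one random attribute."""
    j = rng.randrange(len(X[0]))
    mx = fmean(row[j] for row in X)
    my = fmean(y)
    sxx = sum((row[j] - mx) ** 2 for row in X)
    sxy = sum((row[j] - mx) * (t - my) for row, t in zip(X, y))
    slope = sxy / (sxx + lam)  # lam > 0 shrinks the slope towards zero
    return ("leaf", j, slope, my - slope * mx)


def build_tree(X, y, rng, min_leaf=5, lam=1.0):
    """Grow one tree: random split attribute, median split point (both assumed)."""
    if len(y) >= 2 * min_leaf:
        j = rng.randrange(len(X[0]))
        thr = median(row[j] for row in X)
        left = [i for i, row in enumerate(X) if row[j] <= thr]
        right = [i for i, row in enumerate(X) if row[j] > thr]
        if len(left) >= min_leaf and len(right) >= min_leaf:
            return (
                "split", j, thr,
                build_tree([X[i] for i in left], [y[i] for i in left], rng, min_leaf, lam),
                build_tree([X[i] for i in right], [y[i] for i in right], rng, min_leaf, lam),
            )
    # Too few examples, or a degenerate split: stop and fit a leaf model.
    return fit_leaf(X, y, rng, lam)


def predict_tree(node, x):
    """Route x down to a leaf and evaluate that leaf's linear model."""
    while node[0] == "split":
        _, j, thr, lower, upper = node
        node = lower if x[j] <= thr else upper
    _, j, slope, intercept = node
    return intercept + slope * x[j]


def fit_ensemble(X, y, n_trees=25, seed=0, **tree_args):
    """Train n_trees randomized trees on the full training set."""
    rng = random.Random(seed)
    return [build_tree(X, y, rng, **tree_args) for _ in range(n_trees)]


def predict(ensemble, x):
    """Ensemble prediction: unweighted mean of the per-tree predictions."""
    return fmean(predict_tree(tree, x) for tree in ensemble)


if __name__ == "__main__":
    X = [[i / 10] for i in range(200)]
    y = [2 * row[0] + 1 for row in X]
    ensemble = fit_ensemble(X, y, n_trees=10, seed=1, lam=1e-6)
    print(predict(ensemble, [5.0]))
```

Because no split-quality criterion is evaluated, the cost of building each tree in this sketch is dominated by median computation and leaf fitting, which is one plausible reading of why such trees could scale well; the paper itself should be consulted for the actual construction and its tuning.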


Keywords: regression, ensembles, supervised learning, randomization





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Bernhard Pfahringer, University of Waikato, New Zealand
