Learning Model Trees from Data Streams

  • Elena Ikonomovska
  • Joao Gama
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5255)


In this paper we propose a fast and incremental algorithm for learning model trees from data streams (FIMT) for regression problems. The algorithm is incremental, works online, processes each example once at the speed it arrives, and maintains an any-time regression model. The leaves contain linear models trained online from the examples that fall into each leaf, a process with low complexity. The use of linear models in the leaves increases the algorithm's any-time global performance. FIMT is able to obtain accuracy competitive with batch learners even for medium-sized datasets, with training time that is better by an order of magnitude. We study the properties of FIMT over several artificial and real datasets and evaluate its sensitivity to the order of examples and the noise level.
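The idea of a leaf that holds a linear model trained online, seeing each example once and able to predict at any time, can be illustrated with a small sketch. The abstract does not say which update rule is used in the leaves, so the delta (least-mean-squares) rule below is an assumption, and the names `OnlineLinearLeaf`, `learn_one`, and `synthetic_stream` are hypothetical. The sketch covers a single leaf only and does not include the tree growing or splitting that FIMT performs.

```python
import random


class OnlineLinearLeaf:
    """Hypothetical leaf model: a linear regressor updated one example at a time."""

    def __init__(self, n_features, learning_rate=0.01):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate
        self.n_seen = 0  # number of examples processed so far

    def predict(self, x):
        # Any-time prediction: usable after any number of updates.
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # Delta-rule update (an assumption, not taken from the abstract):
        # move the weights along the input, scaled by the prediction error.
        error = y - self.predict(x)
        step = self.learning_rate * error
        self.weights = [w + step * xi for w, xi in zip(self.weights, x)]
        self.bias += step
        self.n_seen += 1


def synthetic_stream(n_examples, seed=0):
    """Yield (x, y) pairs from a noise-free linear target, one at a time."""
    rng = random.Random(seed)
    for _ in range(n_examples):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 2.0 * x[0] - 1.0 * x[1] + 0.5
        yield x, y


# Each example is seen exactly once and then discarded.
leaf = OnlineLinearLeaf(n_features=2, learning_rate=0.05)
for x, y in synthetic_stream(5000):
    leaf.learn_one(x, y)
```

Each update costs time linear in the number of attributes and stores no past examples, which is one way to read the "low complexity" claim for leaf training.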


Data Stream, Model Tree, Numerical Attribute, Multivariate Adaptive Regression Spline, Incremental Learner
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2008

Authors and Affiliations

  • Elena Ikonomovska (1)
  • Joao Gama (2)
  1. FEIT – Ss. Cyril and Methodius University, Skopje, Macedonia
  2. LIAAD/INESC, FEP – University of Porto, Porto, Portugal
