Improving SVM-Linear Predictions Using CART for Example Selection

  • João M. Moreira
  • Alípio M. Jorge
  • Carlos Soares
  • Jorge Freire de Sousa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4203)


This paper describes a study of example selection in regression problems, using linear μ-SVM (Support Vector Machine) as the prediction algorithm. The motivating case is a study on real data for a bus trip time prediction problem. In that study we use three different training sets: all the examples, examples from past days similar to the day for which a prediction is needed, and examples selected by a CART regression tree. We then verify whether the CART-based example selection approach is appropriate for other regression data sets. The experimental results obtained are promising.
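The abstract does not spell out how the CART-based selection works, so the following is only a minimal sketch of one plausible reading: grow a regression tree on the training data, find the leaf into which the query example falls, and train the predictor on the examples in that leaf alone. Everything in it is an assumption made for illustration, not the authors' implementation: the function names (`build_tree`, `select_examples`, `predict_line`), the toy data, the depth-1 tree, and the ordinary least-squares line that stands in for the linear SVM so that the example needs only the Python standard library.

```python
from statistics import mean

def sse(ys):
    """Sum of squared deviations of ys from their mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(rows, max_depth=1, min_leaf=3):
    """Grow a small CART-style regression tree (illustrative only).

    rows is a list of (features, target) pairs. An internal node is a tuple
    (feature_index, threshold, left, right); a leaf is the list of its rows.
    """
    rows = list(rows)
    if max_depth == 0:
        return rows
    best = None
    for j in range(len(rows[0][0])):
        values = sorted({x[j] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [r for r in rows if r[0][j] <= t]
            right = [r for r in rows if r[0][j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    # Stop when no admissible split reduces the squared error.
    if best is None or best[0] >= sse([y for _, y in rows]):
        return rows
    _, j, t, left, right = best
    return (j, t,
            build_tree(left, max_depth - 1, min_leaf),
            build_tree(right, max_depth - 1, min_leaf))

def select_examples(tree, x):
    """Return the training rows stored in the leaf that x falls into."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

def predict_line(rows, j, x):
    """Fit y = a + b * feature_j by least squares and predict for x.

    Stand-in for the linear SVM: any linear regressor could be used here.
    """
    xs = [f[j] for f, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    sxx = sum((v - mx) ** 2 for v in xs)
    b = sum((v - mx) * (y - my) for v, y in zip(xs, ys)) / sxx if sxx else 0.0
    return (my - b * mx) + b * x[j]

# Made-up data: feature 0 is a regime flag, feature 1 a numeric input.
# Each regime follows its own linear relation.
rows = ([((0, x), 10 + 2 * x) for x in range(10)]
        + [((1, x), 100 - x) for x in range(10)])
query = (1, 4)

tree = build_tree(rows)
selected = select_examples(tree, query)       # examples sharing the query's leaf
local_pred = predict_line(selected, 1, query)  # model trained on the leaf only
global_pred = predict_line(rows, 1, query)     # model trained on all examples
print(len(selected), local_pred, global_pred)
```

On this toy data the tree separates the two regimes, so the model trained on the selected examples follows the relation that generated the query's regime, while the model trained on all examples averages the two. Real data will rarely separate this cleanly, and the tree depth and minimum leaf size would need tuning.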


Leaf Node; Induction Algorithm; Wrap Approach; Wisconsin Breast Cancer; Boston Housing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • João M. Moreira (1)
  • Alípio M. Jorge (2)
  • Carlos Soares (2)
  • Jorge Freire de Sousa (2)
  1. Faculty of Engineering, University of Porto, Portugal
  2. Faculty of Economics, LIACC, University of Porto, Portugal
