Regression by classification

  • Luís Torgo
  • João Gama
Planning, Learning and Heuristic Search
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1159)


We present a methodology that enables the use of existing classification inductive learning systems on regression problems. We achieve this goal by transforming regression problems into classification problems: the range of the continuous goal variable is mapped into a set of intervals that are then used as discrete classes. We provide several methods for discretizing the goal variable values, all based on the idea of an iterative search for the final set of discrete classes. The search algorithm is guided by an N-fold cross-validation estimate of the prediction error that results from using a given set of discrete classes. We have carried out an extensive empirical evaluation of our discretization methodologies using C4.5 and CN2 on four real-world domains. The results of these experiments show that our discretization methods compare favourably with existing methods.

Our method is independent of the particular classification inductive system used and is easily applicable to other inductive algorithms. This generality makes our method a powerful tool that extends the applicability of a wide range of existing classification systems.
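The core procedure described in the abstract can be sketched in a few lines. The following is an illustrative Python sketch, not the authors' actual implementation: it discretizes the continuous goal variable into k equal-frequency intervals (one of several possible discretization strategies), uses each interval's median as the numeric prediction for that class, and selects k by N-fold cross-validation of the resulting regression error. A toy 1-nearest-neighbour classifier on a single numeric feature stands in for C4.5 or CN2.

```python
# Hedged sketch of regression-by-discretization (not the paper's exact method).
# The continuous target is split into k equal-frequency intervals; each interval
# becomes a discrete class whose median serves as the numeric prediction.
# k is chosen by N-fold cross-validation of the resulting prediction error.
import statistics


def discretize(y, k):
    """Map each target value to one of k equal-frequency classes;
    return per-example class labels and the median value of each class."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    labels = [0] * len(y)
    for rank, i in enumerate(order):
        labels[i] = min(rank * k // len(y), k - 1)
    reps = [statistics.median([y[i] for i in range(len(y)) if labels[i] == c])
            for c in range(k)]
    return labels, reps


def one_nn(train_x, train_labels, x):
    """Toy stand-in classifier: label of the nearest training point (1-D)."""
    j = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_labels[j]


def cv_error(x, y, k, folds=5):
    """N-fold CV estimate of mean absolute error when using k classes."""
    err, n = 0.0, len(x)
    for f in range(folds):
        test = set(range(f, n, folds))
        tr = [i for i in range(n) if i not in test]
        labels, reps = discretize([y[i] for i in tr], k)
        for i in test:
            c = one_nn([x[j] for j in tr], labels, x[i])
            err += abs(reps[c] - y[i])
    return err / n


def best_k(x, y, candidates=(2, 3, 4, 5, 8)):
    """Search over candidate class counts, keeping the lowest CV error."""
    return min(candidates, key=lambda k: cv_error(x, y, k))
```

For a smooth target, finer discretizations yield class medians closer to the true values, so the CV-guided search tends to favour more classes until the per-class sample becomes too small; the paper's search procedures explore this trade-off iteratively.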


Keywords: learning, regression, classification, discretization methods




Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Luís Torgo (1)
  • João Gama (1)
  1. LIACC, University of Porto, Porto, Portugal
