Abstract
This paper describes a new, context-sensitive discretization algorithm that combines aspects of unsupervised (class-blind) and supervised methods. The algorithm is applicable to a wide range of machine learning and data mining problems in which continuous attributes need to be discretized. Here, we evaluate its utility in a regression-by-classification setting. Preliminary experimental results indicate that decision trees induced using this discretization strategy are significantly smaller, and thus more comprehensible, than those learned with standard discretization methods, with only a minimal loss in numerical prediction accuracy. This may be a considerable advantage in machine learning and data mining applications where comprehensibility matters.
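As a rough illustration of the regression-by-classification setting mentioned above, the following Python sketch discretizes the numeric target into intervals, trains a classifier on the interval labels, and predicts the median of the predicted interval. It is a sketch under stated assumptions, not the algorithm of this paper: plain equal-frequency binning stands in for the proposed discretization, a 1-nearest-neighbour rule on a single attribute stands in for the decision tree learner, and all function names are hypothetical.

```python
from statistics import median


def equal_frequency_groups(values, k):
    """Sort the values and cut them into k groups of (nearly) equal size.

    Assumes 1 <= k <= len(values), so that no group is empty.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = [i * n // k for i in range(k + 1)]
    return [ordered[cuts[i]:cuts[i + 1]] for i in range(k)]


def make_target_classes(ys, k):
    """Turn a numeric target into k interval classes.

    Returns a function mapping a target value to its class index, and the
    list of class representatives (the median of each interval).
    """
    groups = equal_frequency_groups(ys, k)
    upper_edges = [g[-1] for g in groups]
    representatives = [median(g) for g in groups]

    def to_class(value):
        for index, edge in enumerate(upper_edges):
            if value <= edge:
                return index
        return len(upper_edges) - 1  # values beyond the last edge

    return to_class, representatives


def fit_1nn(xs, labels):
    """Stand-in classifier: label of the nearest training example."""
    pairs = list(zip(xs, labels))

    def classify(x):
        return min(pairs, key=lambda pair: abs(pair[0] - x))[1]

    return classify


def regression_by_classification(xs, ys, k):
    """Learn a numeric predictor by solving a k-class problem."""
    to_class, representatives = make_target_classes(ys, k)
    classify = fit_1nn(xs, [to_class(y) for y in ys])
    return lambda x: representatives[classify(x)]


# Toy data, invented purely for illustration.
predict = regression_by_classification(
    xs=[1, 2, 3, 4, 5, 6],
    ys=[1.0, 2.0, 5.0, 6.0, 9.0, 10.0],
    k=3,
)
print(predict(2.2))  # 1.5, the median of the lowest target interval
print(predict(5.6))  # 9.5, the median of the highest target interval
```

The number of intervals k controls the trade-off in such a scheme: fewer intervals give a simpler classification problem but coarser numeric predictions.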
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ludl, M.-C., Widmer, G. (2000). Relative Unsupervised Discretization for Regression Problems. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. Lecture Notes in Computer Science, vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_26
DOI: https://doi.org/10.1007/3-540-45164-1_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8
eBook Packages: Springer Book Archive