Abstract
Rule induction from data with numerical attributes must be accompanied by discretization. Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. MLEM2 extends the existing LEM2 rule induction algorithm, which is part of the LERS data mining system and works correctly only for symbolic attributes. For the two strategies based on cluster analysis, rules were induced by the LEM2 algorithm. Our results show that MLEM2 outperformed both strategies that combine cluster analysis with LEM2 in terms of complexity (size of rule sets and total number of conditions) and, more importantly, in terms of error rates.
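To make the term concrete, the following is a minimal sketch of what discretization does to a numerical attribute: it replaces each value with a symbolic interval label, so that an algorithm limited to symbolic attributes can process the data. It is an illustration only, not the cluster-analysis techniques compared in the chapter and not MLEM2. The function names `candidate_cutpoints` and `discretize`, the midpoint rule for candidate cutpoints, and the `lo..hi` label format are all assumptions made for this example.

```python
from bisect import bisect_right


def candidate_cutpoints(values):
    """Return midpoints between consecutive distinct sorted values.

    Assumption for illustration: the midpoint rule is one common way to
    propose cutpoints; the chapter's abstract does not specify a rule.
    """
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def discretize(values, cutpoints):
    """Map each numeric value to a symbolic interval label such as '1.0..3.0'.

    A value equal to a cutpoint is assigned to the interval on its right.
    """
    cuts = sorted(cutpoints)
    bounds = [min(values)] + cuts + [max(values)]
    labels = [f"{lo}..{hi}" for lo, hi in zip(bounds, bounds[1:])]
    return [labels[bisect_right(cuts, v)] for v in values]


# Hypothetical attribute values; the single cutpoint 3.0 is chosen by hand.
weights = [1.0, 2.0, 2.0, 4.0]
print(candidate_cutpoints(weights))
print(discretize(weights, [3.0]))
```

In a preprocessing strategy, the choice of cutpoints is made before rule induction begins; in the simultaneous approach the abstract attributes to MLEM2, that choice is made while rules are being induced.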
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grzymala-Busse, J.W. (2004). Three Strategies to Rule Induction from Data with Numerical Attributes. In: Peters, J.F., Skowron, A., Dubois, D., Grzymała-Busse, J.W., Inuiguchi, M., Polkowski, L. (eds) Transactions on Rough Sets II. Lecture Notes in Computer Science, vol 3135. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27778-1_4
DOI: https://doi.org/10.1007/978-3-540-27778-1_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23990-1
Online ISBN: 978-3-540-27778-1
eBook Packages: Computer Science, Computer Science (R0)