Three Strategies to Rule Induction from Data with Numerical Attributes

  • Conference paper
Transactions on Rough Sets II

Part of the book series: Lecture Notes in Computer Science (TRS, volume 3135)

Abstract

Rule induction from data with numerical attributes must be accompanied by discretization. Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. MLEM2 extends the existing LEM2 rule induction algorithm, which is part of the LERS data mining system and works correctly only for symbolic attributes. For the two strategies based on cluster analysis, rules were induced by the LEM2 algorithm. Our results show that MLEM2 outperformed both strategies that combine cluster analysis with LEM2, in terms of complexity (size of rule sets and total number of conditions) and, more importantly, in terms of error rates.
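To make "discretization performed simultaneously with rule induction" more concrete, the following is a minimal Python sketch of one step commonly described for MLEM2: deriving candidate interval conditions for a numerical attribute from midpoints between consecutive distinct values. The abstract itself does not spell out this procedure, so the midpoint rule, the function names `candidate_cutpoints` and `interval_blocks`, and the sample data are assumptions made for illustration only, not the paper's implementation.

```python
from typing import Dict, List, Set, Tuple


def candidate_cutpoints(values: List[float]) -> List[float]:
    """Return midpoints between consecutive distinct sorted values.

    Assumption: cutpoints are simple averages of neighbouring values.
    """
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def interval_blocks(values: List[float]) -> Dict[Tuple[float, float], Set[int]]:
    """Map each candidate interval (low, high) to the case indices it covers.

    Each cutpoint c yields two candidate conditions: values below c and
    values above c. A rule inducer could then choose among these blocks
    in the same way it chooses among blocks of symbolic attribute values.
    """
    distinct = sorted(set(values))
    if not distinct:
        return {}
    lo, hi = distinct[0], distinct[-1]
    blocks: Dict[Tuple[float, float], Set[int]] = {}
    for c in candidate_cutpoints(values):
        blocks[(lo, c)] = {i for i, v in enumerate(values) if v < c}
        blocks[(c, hi)] = {i for i, v in enumerate(values) if v > c}
    return blocks


# Hypothetical attribute values for four cases (indices 0..3).
ages = [25.0, 40.0, 25.0, 60.0]
cuts = candidate_cutpoints(ages)
blocks = interval_blocks(ages)
```

In this sketch no intervals are fixed in advance: every cutpoint stays available as a candidate condition while rules are being built, which is the contrast with the two cluster-analysis strategies, where discretization is completed before LEM2 runs.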




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grzymala-Busse, J.W. (2004). Three Strategies to Rule Induction from Data with Numerical Attributes. In: Peters, J.F., Skowron, A., Dubois, D., Grzymała-Busse, J.W., Inuiguchi, M., Polkowski, L. (eds) Transactions on Rough Sets II. Lecture Notes in Computer Science, vol 3135. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27778-1_4

  • DOI: https://doi.org/10.1007/978-3-540-27778-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23990-1

  • Online ISBN: 978-3-540-27778-1

  • eBook Packages: Computer Science, Computer Science (R0)
