Comparison of Information Theoretical Measures for Reduct Finding

  • Szymon Jaroszewicz
  • Marcin Korzeń
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4029)

Abstract

The paper discusses the properties of an attribute selection criterion for building rough set reducts based on the discernibility matrix and compares it with the Shannon entropy and Gini index criteria used for building decision trees. It is shown theoretically and experimentally that entropy and the Gini index tend to work better when the reduct is later used to predict previously unseen cases, whereas the criterion based on the discernibility matrix tends to work better for learning functional relationships, where generalization is not an issue.
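To make the three criteria concrete, the following is a minimal Python sketch that scores a single attribute of a toy decision table. It uses textbook definitions (conditional Shannon entropy, weighted Gini index, and the count of object pairs with different decisions that the attribute tells apart), which are assumed here and may differ in detail from the exact formulations in the paper. The function names, the `rows` layout, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def _decisions_by_value(rows, attr):
    """Group decision labels by the value that attribute `attr` takes."""
    groups = {}
    for values, decision in rows:
        groups.setdefault(values[attr], []).append(decision)
    return groups


def conditional_entropy(rows, attr):
    """Shannon entropy of the decision given the attribute (lower is better)."""
    n = len(rows)
    total = 0.0
    for decisions in _decisions_by_value(rows, attr).values():
        m = len(decisions)
        h = -sum(c / m * log2(c / m) for c in Counter(decisions).values())
        total += m / n * h
    return total


def conditional_gini(rows, attr):
    """Weighted Gini index of the decision given the attribute (lower is better)."""
    n = len(rows)
    total = 0.0
    for decisions in _decisions_by_value(rows, attr).values():
        m = len(decisions)
        g = 1.0 - sum((c / m) ** 2 for c in Counter(decisions).values())
        total += m / n * g
    return total


def discerned_pairs(rows, attr):
    """Number of object pairs with different decisions that differ on `attr`.

    This equals the number of non-empty discernibility matrix entries that
    contain `attr` (higher is better). Assumed textbook form of the criterion.
    """
    count = 0
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            (vi, di), (vj, dj) = rows[i], rows[j]
            if di != dj and vi[attr] != vj[attr]:
                count += 1
    return count


# Hypothetical toy table: each row is (attribute values, decision).
# Attribute 0 determines the decision; attribute 1 is independent of it.
rows = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "b"), ((1, 1), "b")]

for attr in (0, 1):
    print(attr,
          conditional_entropy(rows, attr),
          conditional_gini(rows, attr),
          discerned_pairs(rows, attr))
```

On this toy table, attribute 0 scores entropy 0.0, Gini 0.0 and 4 discerned pairs, while the independent attribute 1 scores entropy 1.0 and Gini 0.5 yet still discerns 2 pairs. This small case hints at why a pair-counting criterion can behave differently from the impurity measures on attributes unrelated to the decision.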

Keywords

Shannon Entropy; Gini Index; Independent Attribute; Validation Error; Discernibility Matrix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Szymon Jaroszewicz (1)
  • Marcin Korzeń (1)
  1. Faculty of Computer Science and Information Systems, Technical University of Szczecin, Szczecin, Poland
