Science China Information Sciences, Volume 53, Issue 6, pp 1188–1200

Information entropy for ordinal classification

Research Papers

Abstract

Ordinal classification plays an important role in various decision-making tasks. However, this type of learning task has received little attention compared with general classification learning. Shannon's information entropy and the derived measure of mutual information play a fundamental role in a number of learning algorithms, including feature evaluation, feature selection and decision tree construction. These measures are not applicable to ordinal classification because they cannot characterize the consistency of monotonicity in ordinal classification. In this paper, we generalize Shannon’s entropy to crisp ordinal classification and fuzzy ordinal classification, and present the information measures of ranking mutual information and fuzzy ranking mutual information. We discuss the properties of these measures and show that the proposed ranking mutual information and fuzzy ranking mutual information are indexes of the consistency of monotonicity in ordinal classification. In addition, the proposed indexes are used to evaluate the degree of monotonicity between features and decision in the context of ordinal classification.
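The claim that Shannon's measures cannot capture monotonicity can be illustrated with a small sketch. The function `mutual_information` below is the standard empirical Shannon mutual information. The function `ranking_mutual_information` is an assumed dominance-set analogue written for illustration only: the abstract does not state the paper's formulas, so its exact form, the function names and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical Shannon mutual information I(X;Y) in bits."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def ranking_mutual_information(xs, ys):
    """Assumed ordinal analogue (illustrative, not taken from the abstract):
    equivalence classes are replaced by dominance sets
    {j : value[j] >= value[i]} for each sample i."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        up_x = {j for j in range(n) if xs[j] >= xs[i]}
        up_y = {j for j in range(n) if ys[j] >= ys[i]}
        # The intersection always contains i, so it is never empty.
        total += math.log2(len(up_x) * len(up_y) / (n * len(up_x & up_y)))
    return -total / n


# Hypothetical toy data: one ordinal feature and two candidate decisions.
feature = [1, 1, 2, 2, 3, 3]
monotone = [1, 1, 2, 2, 3, 3]    # decision increases with the feature
scrambled = [2, 2, 3, 3, 1, 1]   # same partition, ordering destroyed

mi_monotone = mutual_information(feature, monotone)
mi_scrambled = mutual_information(feature, scrambled)
rmi_monotone = ranking_mutual_information(feature, monotone)
rmi_scrambled = ranking_mutual_information(feature, scrambled)
```

Shannon mutual information depends only on how the samples are partitioned, so relabeling the decision classes leaves it unchanged. The dominance-set variant compares samples by order and therefore scores the monotone decision higher than the scrambled one.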

Keywords

ordinal classification, information entropy, ranking entropy, ranking mutual information

Copyright information

© Science China Press and Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Harbin Institute of Technology, Harbin, China