Information entropy for ordinal classification
Ordinal classification plays an important role in various decision-making tasks, yet it has received far less attention than general classification learning. Shannon's information entropy and the derived measure of mutual information are fundamental to many learning algorithms, including feature evaluation, feature selection, and decision tree construction. These measures, however, are not applicable to ordinal classification, because they cannot characterize the consistency of monotonicity between features and decisions. In this paper, we generalize Shannon's entropy to crisp ordinal classification and fuzzy ordinal classification, and introduce the measures of ranking mutual information and fuzzy ranking mutual information. We discuss the properties of these measures and show that the proposed ranking mutual information and fuzzy ranking mutual information serve as indexes of the consistency of monotonicity in ordinal classification. In addition, the proposed indexes are used to evaluate the degree of monotonicity between features and the decision in the context of ordinal classification.
Keywords: ordinal classification, information entropy, ranking entropy, ranking mutual information
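To make the idea of a monotonicity-consistency index concrete, the sketch below illustrates a dominance-based ranking entropy and ranking mutual information in Python. It is an illustrative reconstruction following the general idea described in the abstract (dominance classes of samples under the ordering of feature and decision values), not the paper's exact definitions; the function names and the single-feature setting are assumptions for the example.

```python
import math

def dominated_set_sizes(values):
    """For each sample i, count the samples j with values[j] >= values[i],
    i.e. the size of the ascending dominance class of x_i under one feature."""
    return [sum(1 for v in values if v >= x) for x in values]

def ranking_entropy(values):
    """Dominance-based ranking entropy: the average of -log(|[x_i]| / n)
    over all samples, where [x_i] is the ascending dominance class of x_i."""
    n = len(values)
    return -sum(math.log(k / n) for k in dominated_set_sizes(values)) / n

def ranking_mutual_information(feature, decision):
    """A monotonicity-consistency index between a feature and the decision:
    -1/n * sum_i log(|F_i| * |D_i| / (n * |F_i & D_i|)), where F_i and D_i
    are the ascending dominance classes of sample i under the feature and
    the decision. The more consistently the feature ordering agrees with
    the decision ordering, the larger the value."""
    n = len(feature)
    total = 0.0
    for i in range(n):
        F_i = {j for j in range(n) if feature[j] >= feature[i]}
        D_i = {j for j in range(n) if decision[j] >= decision[i]}
        # Sample i always belongs to both classes, so the intersection
        # is never empty and the logarithm is well defined.
        total += math.log(len(F_i) * len(D_i) / (n * len(F_i & D_i)))
    return -total / n
```

With a perfectly monotone feature (e.g. `feature = decision = [1, 2, 3, 4]`) the index attains its maximum for that decision, equal to the ranking entropy of the decision, while an anti-monotone feature such as `[4, 3, 2, 1]` yields a strictly smaller value.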