A New Criterion of Mutual Information Using R-value
Mutual information has a wide range of applications, including feature selection and classification. It is traditionally calculated with the statistical equations of information theory. In this paper, we propose a new criterion for mutual information based on the R-value, which captures the overlapping areas among classes in variables (features). The overlapping area of classes reflects the uncertainty of a variable and thus corresponds to the meaning of entropy. We compare traditional mutual information and the R-value in the context of feature selection. The experiments confirm that the proposed method performs better than traditional mutual information.
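For orientation, the traditional criterion referred to above can be sketched for a discrete feature and a class label using the standard identity I(X; Y) = H(X) + H(Y) - H(X, Y). This is a minimal illustration, not the implementation used in the paper: the function names and the toy data are assumptions, and the R-value itself is not reproduced because its definition is not given here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(feature, labels):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences."""
    joint = list(zip(feature, labels))
    return entropy(feature) + entropy(labels) - entropy(joint)


# Hypothetical toy data: one feature separates the classes, the other does not.
labels = ["a", "a", "b", "b"]
separating = [0, 0, 1, 1]   # feature values never shared between classes
overlapping = [0, 1, 0, 1]  # every feature value occurs in both classes

mi_sep = mutual_information(separating, labels)
mi_ovl = mutual_information(overlapping, labels)
print(mi_sep, mi_ovl)
```

In this toy case the feature whose values never overlap between classes receives the higher mutual information, which matches the intuition in the abstract that class overlap in a feature reflects uncertainty.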
Keywords: Entropy, Mutual information, Attribute interaction, R-value, Information theory, Data mining
This work was supported by the National Research Foundation of Korea Grant funded by the Korean Government (NRF-2012S1A2A1A01028576).