A New Maximum-Relevance Criterion for Significant Gene Selection

  • Young Bun Kim
  • Jean Gao
  • Pawel Michalak
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4146)


Gene (feature) selection has been an active research area in microarray analysis. Max-Relevance is one of the criteria that have been widely used to find features highly correlated with the target class. However, most approximation methods for Max-Relevance do not consider the joint effect of features on the target class. We propose a new Max-Relevance criterion that combines the collective impact of the most expressive features in Emerging Patterns (EPs) with popular independent criteria such as the t-test and symmetrical uncertainty. The main benefit of this criterion is that, by capturing the joint effect of features with an EP algorithm, it finds the most discriminative features in a broader scope. Experimental results demonstrate that our feature sets improve class prediction compared with those of other feature selection methods.
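
The abstract names three ingredients without giving formulas: the t-test, symmetrical uncertainty, and Emerging Patterns. The sketch below illustrates each one using its standard textbook definition. It is not the authors' criterion, and the abstract does not say how the scores are combined, so no combination step is shown. All function names (`entropy`, `symmetrical_uncertainty`, `welch_t`, `growth_rate`) are hypothetical, and the choice of Welch's variant of the t statistic is an assumption.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1].

    x is a discretized gene and y the class labels (same length).
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def welch_t(a, b):
    """Welch's two-sample t statistic for one gene across two classes.

    Assumes at least two samples per group and that the two sample
    variances are not both zero.
    """
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def growth_rate(pattern, background, target):
    """Growth rate of an itemset from the background class to the target class.

    Each sample is a set of discretized items such as "g1_high".
    An infinite growth rate corresponds to a jumping emerging pattern.
    """
    def support(samples):
        return sum(pattern <= s for s in samples) / len(samples)

    sup_bg, sup_tg = support(background), support(target)
    if sup_bg == 0:
        return math.inf if sup_tg > 0 else 0.0
    return sup_tg / sup_bg


# Toy data (made up for illustration only).
labels = [0, 0, 1, 1]
gene_discretized = [0, 0, 1, 1]          # perfectly tracks the class
su = symmetrical_uncertainty(gene_discretized, labels)
t = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
gr = growth_rate({"g1_high"},
                 background=[{"g1_low"}, {"g1_low"}],
                 target=[{"g1_high"}, {"g1_high", "g2_low"}])
```

The t statistic and symmetrical uncertainty score each gene on its own, whereas the growth rate is computed for a whole pattern and can therefore involve several genes at once. That difference is the joint effect the abstract refers to.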


Keywords: Support Vector Machine; Feature Selection; Gene Selection; Feature Subset; Target Class



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Young Bun Kim (1)
  • Jean Gao (1)
  • Pawel Michalak (2)
  1. Department of Computer Science and Engineering
  2. Department of Biology, The University of Texas, Arlington, USA
