Performance Evaluation of Ranking Methods for Relevant Gene Selection in Cancer Microarray Datasets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7629)

Abstract

Microarray data are typically characterized by high dimensionality and small sample size. Gene ranking is one of the most widely explored techniques for reducing the dimension because of its simplicity and computational efficiency. Many ranking methods have been proposed, and their effectiveness depends on the problem at hand. We investigated the performance of six ranking methods on eleven cancer microarray datasets. Performance is evaluated in terms of classification accuracy and the number of genes selected. Experimental results on all datasets show significant variation in classification accuracy, depending on the choice of ranking method and classifier. Empirical results show that the Brown-Forsythe test statistic and the Mutual Information method achieve high accuracy with few genes, whereas the Gini Index and the Pearson Coefficient perform poorly in most cases.
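As an illustration of what gene ranking involves, the following is a minimal sketch of scoring genes with the Brown-Forsythe statistic for equality of class means and keeping the top-scoring ones. It is not the authors' implementation: the function names `brown_forsythe_scores` and `top_genes`, the toy data, and the zero-variance guard are assumptions made for this example, and the statistic is taken in its standard form, the sum over classes of n_k (class mean - grand mean)^2 divided by the sum over classes of (1 - n_k/N) s_k^2.

```python
import numpy as np


def brown_forsythe_scores(X, y):
    """Score each gene with the Brown-Forsythe statistic for equal class means.

    X: (n_samples, n_genes) expression matrix.
    y: (n_samples,) class labels; every class needs at least two samples.
    Returns an array of n_genes scores (larger = more discriminative).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_total = X.shape[0]
    grand_mean = X.mean(axis=0)
    numerator = np.zeros(X.shape[1])
    denominator = np.zeros(X.shape[1])
    for label in np.unique(y):
        group = X[y == label]
        n_k = group.shape[0]
        # Between-class spread, weighted by class size.
        numerator += n_k * (group.mean(axis=0) - grand_mean) ** 2
        # Within-class sample variance, weighted by (1 - n_k / N).
        denominator += (1.0 - n_k / n_total) * group.var(axis=0, ddof=1)
    # Assumption for this sketch: clamp the denominator to avoid division by zero.
    return numerator / np.maximum(denominator, 1e-12)


def top_genes(X, y, k):
    """Return the indices of the k highest-scoring genes, best first."""
    scores = brown_forsythe_scores(X, y)
    return np.argsort(scores)[::-1][:k]


# Toy data (hypothetical): 20 samples, 50 genes, gene 7 shifted in class 1.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 50))
y_toy = np.array([0] * 10 + [1] * 10)
X_toy[y_toy == 1, 7] += 5.0
selected = top_genes(X_toy, y_toy, k=5)
```

In a study of the kind described above, the selected gene subset would then be passed to a classifier and the accuracy recorded for increasing values of `k`; the ranking step itself is independent of the classifier.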

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sardana, M., Kaur, B., Agrawal, R.K. (2013). Performance Evaluation of Ranking Methods for Relevant Gene Selection in Cancer Microarray Datasets. In: Batyrshin, I., González Mendoza, M. (eds) Advances in Artificial Intelligence. MICAI 2012. Lecture Notes in Computer Science, vol 7629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37807-2_35

  • DOI: https://doi.org/10.1007/978-3-642-37807-2_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37806-5

  • Online ISBN: 978-3-642-37807-2

  • eBook Packages: Computer Science, Computer Science (R0)
