Skip to main content

Gene Feature Extraction Using T-Test Statistics and Kernel Partial Least Squares

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4234))

Abstract

In this paper, we propose a gene extraction method by using two standard feature extraction methods, namely the T-test method and kernel partial least squares (KPLS), in tandem. First, a preprocessing step based on the T-test method is used to filter irrelevant and noisy genes. KPLS is then used to extract features with high information content. Finally, the extracted features are fed into a classifier. Experiments are performed on three benchmark datasets: breast cancer, ALL/AML leukemia and colon cancer. While using either the T-test method or KPLS does not yield satisfactory results, experimental results demonstrate that using these two together can significantly boost classification accuracy, and this simple combination can obtain state-of-the-art performance on all three datasets.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   139.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Alon, U., Barkai, N., Notterman, D.A., Gish, K., Ybarra, S., Mack, D., Levine, A.J.: Broad patterns of gene expression revealed by clustering of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Science 96, 6745–6750 (1999)

    Article  Google Scholar 

  2. Ben-Dor, A., Bruhn, L., Friedman, N., Nachman, I., Schummer, M., Yakhini, Z.: Tissue classification with gene expression profiles. In: Proceedings of the Fourth Annual International Conference on Computational Molecular Biology, pp. 54–64 (2000)

    Google Scholar 

  3. Chai, H., Domeniconi, C.: An evaluation of gene selection methods for multi-class microarray data classification. In: Proceedings of the Second European Workshop on Data Mining and Text Mining for Bioinformatics, Pisa, Italy, September 2004, pp. 3–10 (2004)

    Google Scholar 

  4. Duan, K., Rajapakse, J.C.: A variant of SVM-RFE for gene selection in cancer classification with expression data. In: Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, pp. 49–55 (2004)

    Google Scholar 

  5. Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D., Lander, E.S.: Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 286(5439), 531–537 (1999)

    Article  Google Scholar 

  6. Krishnapuram, B., Carin, L., Hartemink, A.: Gene expression analysis: Joint feature selection and classifier design. In: Schölkopf, B., Tsuda, K., Vert, J.-P. (eds.) Kernel Methods in Computational Biology, pp. 299–318. MIT, Cambridge (2004)

    Google Scholar 

  7. Liu, H., Li, J., Wong, L.: A comparative study on feature selection and classification methods using gene expression profiles and proteomic patterns. Genome Informatics 13, 51–60 (2002)

    Google Scholar 

  8. Ni, B., Liu, J.: A hybrid filter/wrapper gene selection method for microarray classification. In: Proceedings of International Conference on Machine Learning and Cybernetics, pp. 2537–2542 (2004)

    Google Scholar 

  9. Rosipal, R.: Kernel partial least squares for nonlinear regression and discrimination. Neural Network World 13(3), 291–300 (2003)

    Google Scholar 

  10. Rosipal, R., Trejo, L.J., Matthews, B.: Kernel PLS-SVC for linear and nonlinear classification. In: Proceedings of the Twentieth International Conference on Machine Learning, Washington, D.C., USA, August 2003, pp. 640–647 (2003)

    Google Scholar 

  11. Tang, Y., Zhang, Y.-Q., Huang, Z.: FCM-SVM-RFE gene feature selection algorithm for leukemia classification from microarray gene expression data. In: Proceedings of IEEE International Conference on Fuzzy Systems, pp. 97–101 (2005)

    Google Scholar 

  12. West, M., Blanchette, C., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H., Olson Jr., J.A., Marks, J.R., Nevins, J.R.: Predicting the clinical status of human breast cancer by using gene expression profiles. Proceedings of the National Academy of Science 98(20), 11462–11467 (2001)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, S., Liao, C., Kwok, J.T. (2006). Gene Feature Extraction Using T-Test Statistics and Kernel Partial Least Squares. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893295_2

Download citation

  • DOI: https://doi.org/10.1007/11893295_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-46484-6

  • Online ISBN: 978-3-540-46485-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics