Gene Selection and Classification of Human Lymphoma from Microarray Data

  • Conference paper
Biological and Medical Data Analysis (ISBMDA 2005)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3745)


DNA microarray experiments provide information on thousands of genes, and bioinformatics researchers have analyzed these data with various machine learning techniques to diagnose diseases. Recently, Support Vector Machines (SVM) have been shown to be an effective tool for analyzing microarray data. Previous work involving SVM used every gene in the microarray to classify normal and malignant lymphoid tissue. This paper shows that gene selection techniques that retain only 10% of the genes in the “Lymphochip” (a DNA microarray developed at Stanford University School of Medicine) achieve a classification accuracy of about 98%, which is comparable to the performance obtained using every gene. The paper thus demonstrates the usefulness of feature selection techniques in conjunction with SVM to improve its performance in analyzing Lymphochip microarray data. The improved performance was evident in better accuracy, in ROC (receiver operating characteristic) analysis, and in faster training. Using the selected subsets of Lymphochip genes, the paper then compares the performance of SVM against two other well-known classifiers: multi-layer perceptron (MLP) and linear discriminant analysis (LDA). Experimental results show that SVM outperforms the other two classifiers.
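The select-then-classify workflow summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the gene selection criterion or the SVM settings, so everything below is an assumption made for illustration: genes are ranked by the absolute Welch t-statistic, a small Pegasos-style linear SVM written in plain Python stands in for the SVM software used in the paper, and the data are synthetic rather than Lymphochip measurements. All function names (`welch_t`, `select_top_genes`, `train_linear_svm`, `make_data`, `project`) are hypothetical.

```python
import math
import random


def welch_t(a, b):
    """Welch t-statistic between two lists of expression values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b) + 1e-12)


def select_top_genes(X, y, fraction=0.10):
    """Rank genes by |t| and return the indices of the top `fraction`."""
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == -1]
        scores.append(abs(welch_t(pos, neg)))
    k = max(1, round(n_genes * fraction))
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return sorted(ranked[:k])


def train_linear_svm(X, y, lam=0.01, n_iter=3000, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)
        # Margin is evaluated with the weights before this step's update.
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        w = [(1.0 - eta * lam) * wj for wj in w]
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, row):
    """Sign of the linear decision function, as a label in {-1, +1}."""
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) >= 0 else -1


def make_data(n_samples=60, n_genes=200, n_informative=10, shift=3.0, seed=0):
    """Synthetic stand-in for expression data (NOT the Lymphochip)."""
    rng = random.Random(seed)
    X, y = [], []
    for s in range(n_samples):
        label = 1 if s % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        if label == 1:
            for g in range(n_informative):
                row[g] += shift  # only the first genes carry class signal
        X.append(row)
        y.append(label)
    return X, y


def project(X, genes, means):
    """Keep the selected genes, center them, and append a bias feature."""
    return [[row[g] - means[g] for g in genes] + [1.0] for row in X]


X, y = make_data()
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]

# Select genes on the training split only, to avoid selection bias.
selected = select_top_genes(X_train, y_train, fraction=0.10)
means = {g: sum(r[g] for r in X_train) / len(X_train) for g in selected}

w = train_linear_svm(project(X_train, selected, means), y_train)
preds = [predict(w, row) for row in project(X_test, selected, means)]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"kept {len(selected)} of {len(X[0])} genes, test accuracy {accuracy:.2f}")
```

One design point worth noting in this sketch: the genes are ranked using the training samples alone, and the same gene indices and centering values are then applied to the held-out samples. Ranking on the full data set before splitting would let test information leak into the selection step and tend to inflate the accuracy estimate.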



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kamruzzaman, J., Lim, S., Gondal, I., Begg, R. (2005). Gene Selection and Classification of Human Lymphoma from Microarray Data. In: Oliveira, J.L., Maojo, V., Martín-Sánchez, F., Pereira, A.S. (eds) Biological and Medical Data Analysis. ISBMDA 2005. Lecture Notes in Computer Science, vol 3745. Springer, Berlin, Heidelberg.

  • DOI:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29674-4

  • Online ISBN: 978-3-540-31658-9

  • eBook Packages: Computer Science, Computer Science (R0)
