Invited Keynote Talk: Data Mining and Statistical Methods for Analyzing Microarray Experiments

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2008)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4983)

Abstract

Deoxyribonucleic acid (DNA) microarrays are part of a promising class of biotechnologies that allow the simultaneous monitoring of expression levels in cells for thousands of genes. One of the important issues in microarray experiments is the classification of biological samples and the prediction of clinical or other outcomes using gene expression data. A closely related issue is the identification of marker genes that have good predictive power for an outcome of interest. Although classification is not a new subject in the statistical literature, the large number of genes combined with the relatively small sample sizes generated by microarray experiments raises new computational challenges. In this study, the gene expression profiles of breast cancer tumors are investigated, and the performance of several popular classification methods, including decision trees, logistic regression, linear discriminant analysis, and k-nearest neighbors, is compared. The results show that certain genes are significantly differentially expressed across groups of patients, and that the k-nearest neighbor method achieves better performance in class prediction than the other classification methods.
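As a rough illustration of the kind of class prediction described above, the following sketch evaluates a k-nearest neighbor classifier with leave-one-out cross-validation. It is not the authors' implementation: the data are synthetic, and the function names (`knn_predict`, `loocv_accuracy`, `make_synthetic`), the choice of Euclidean distance, and all parameter values are assumptions made for the example.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training samples nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def loocv_accuracy(X, y, k=3):
    """Leave-one-out cross-validation: each sample is predicted from all the others."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct += knn_predict(train_X, train_y, X[i], k) == y[i]
    return correct / len(X)


def make_synthetic(n_per_class=10, n_genes=200, n_markers=5, shift=2.0, seed=0):
    """Hypothetical data: Gaussian noise, with the first n_markers genes shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for g in range(n_markers):
                row[g] += shift * label
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    print(f"LOOCV accuracy of 3-NN on synthetic data: {loocv_accuracy(X, y):.2f}")
```

The synthetic setup mirrors the situation the abstract describes, with many more genes than samples and only a few informative marker genes; real expression data would normally be normalized before distances are computed.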

In addition to reviewing and illustrating the implementation of standard statistical tests and classification methods in modeling genome data, we will also address some important issues in the study, such as the role of experimental design (e.g., split-plot experimental design and analysis), the impact of correlation (within plates, between plates, between probes, etc.), and the sampling issue in cross-validation and training-testing splitting. While these issues have been discussed for simple statistical problems, they are not well understood by bioinformatics researchers modeling complex microarray data. In this talk, we will address these issues and their impact on various standard testing and classification methods, and illustrate the potential problems through the cancer tumor microarray experiments.
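One commonly cited sampling pitfall in cross-validation, which may or may not be the specific issue the talk addresses, is selecting marker genes on the full data set before splitting it. The sketch below contrasts that with selection repeated inside each fold. The data are pure noise, the gene score is a simple absolute difference in class means (a stand-in for a proper test statistic), and all function names and parameters are hypothetical.

```python
import math
import random


def select_genes(X, y, m):
    """Indices of the m genes with the largest absolute difference in class means."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:m]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def loocv_accuracy(X, y, m, select_inside, k=3):
    """Leave-one-out accuracy with gene selection inside or outside the folds."""
    # Selecting outside the folds lets the held-out sample influence the gene list.
    global_genes = None if select_inside else select_genes(X, y, m)
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        genes = select_genes(train_X, train_y, m) if select_inside else global_genes
        reduced_train = [[row[g] for g in genes] for row in train_X]
        reduced_test = [X[i][g] for g in genes]
        correct += knn_predict(reduced_train, train_y, reduced_test, k) == y[i]
    return correct / len(X)


if __name__ == "__main__":
    rng = random.Random(0)
    # Pure noise: no gene carries any real information about the labels.
    X = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]
    y = [0] * 10 + [1] * 10
    print("selection outside folds:", loocv_accuracy(X, y, m=10, select_inside=False))
    print("selection inside folds: ", loocv_accuracy(X, y, m=10, select_inside=True))
```

On data with no signal, the estimate obtained with selection outside the folds typically looks much better than chance, while the estimate with selection inside the folds does not; the gap is one way to see how the sampling scheme can distort reported classifier performance.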

Editor information

Ion Măndoiu, Raj Sunderraman, Alexander Zelikovsky

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lo, SL., Tsui, KL., Barwick, B. (2008). Invited Keynote Talk: Data Mining and Statistical Methods for Analyzing Microarray Experiments. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_41

  • DOI: https://doi.org/10.1007/978-3-540-79450-9_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79449-3

  • Online ISBN: 978-3-540-79450-9

  • eBook Packages: Computer Science, Computer Science (R0)
