Summary
This paper reports a comparative study of the predictive performance that can be achieved when class prediction is based on features taken as the top differentially expressed genes from class comparison studies. Several commonly used gene ranking methods from class comparison are considered, including the Wilcoxon rank test, the signal-to-noise ratio, and the fold-change method. Predictive performance is estimated for a range of feature set dimensionalities, which makes it possible to find empirically the classification model expected to perform best on new data. The performance of this model is used as the measure of predictive performance of the feature vectors. Predictive performance is illustrated using publicly available microarray data sets. The results are compared with those obtained using feature selection methods that aim to reduce feature redundancy.
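The procedure outlined in the summary can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the summary does not name the classifier, the validation scheme, or the data sets, so the nearest-centroid classifier, the leave-one-out estimate, the synthetic data, and all function names here are assumptions chosen for brevity. Gene selection is repeated inside every validation fold so that the left-out sample never influences the ranking.

```python
from statistics import mean, stdev
import random

def signal_to_noise(a, b):
    """Signal-to-noise score: difference of class means over the sum of class SDs."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b) + 1e-12)

def fold_change(a, b):
    """Difference of class means (assumes log-scale expression values)."""
    return mean(a) - mean(b)

def rank_sum(a, b):
    """Centered Wilcoxon rank-sum statistic of class a (no tie correction)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    w = sum(rank for rank, (_, lab) in enumerate(pooled, start=1) if lab == 0)
    return w - len(a) * (len(a) + len(b) + 1) / 2

def top_genes(X, y, score, k):
    """Indices of the k genes with the largest absolute score."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(score(a, b)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:k]

def nearest_centroid_predict(X, y, genes, x_new):
    """Assumed classifier: assign x_new to the class with the closer centroid."""
    dists = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroid = [mean(r[g] for r in rows) for g in genes]
        dists[lab] = sum((x_new[g] - c) ** 2 for g, c in zip(genes, centroid))
    return min(dists, key=dists.get)

def loo_error(X, y, score, k):
    """Leave-one-out error rate; genes are re-selected inside every fold."""
    errors = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = top_genes(X_tr, y_tr, score, k)
        errors += nearest_centroid_predict(X_tr, y_tr, genes, X[i]) != y[i]
    return errors / len(X)

def make_toy_data(n_per_class=10, n_genes=50, n_informative=5, seed=0):
    """Synthetic stand-in for a microarray data set (hypothetical values)."""
    rng = random.Random(seed)
    X, y = [], []
    for lab in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_genes)]
            for g in range(n_informative):
                row[g] += 3.0 * lab  # shift informative genes in class 1
            X.append(row)
            y.append(lab)
    return X, y

X, y = make_toy_data()
sizes = (1, 2, 5, 10, 20)
methods = {"wilcoxon": rank_sum, "signal_to_noise": signal_to_noise,
           "fold_change": fold_change}
# Error rate for every ranking method and feature set size.
results = {name: {k: loo_error(X, y, fn, k) for k in sizes}
           for name, fn in methods.items()}
# Feature set size with the lowest estimated error, per ranking method.
best_size = {name: min(errs, key=errs.get) for name, errs in results.items()}
```

Comparing the entries of `results` across methods and sizes mirrors the comparison described in the summary. Note that an error rate picked as the minimum over several sizes is itself optimistically biased, so a real study would confirm the chosen model on held-out data.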
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maciejewski, H. (2008). Predictive Performance of Top Differentially Expressed Genes in Microarray Gene Expression Studies. In: Pietka, E., Kawa, J. (eds) Information Technologies in Biomedicine. Advances in Soft Computing, vol 47. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68168-7_44
DOI: https://doi.org/10.1007/978-3-540-68168-7_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68167-0
Online ISBN: 978-3-540-68168-7
eBook Packages: Engineering, Engineering (R0)