Abstract
In this paper we present a novel system that utilizes molecular-level information for medical diagnosis. It accepts as input high-dimensional vectors of gene expressions, quantified by means of microarray image analysis. The proposed system incorporates various data pre-processing methods, such as missing value estimation and data normalization. A novel approach to the classification of gene expression vectors into multiple classes, which embodies various gene selection methods, has been adopted for diagnostic purposes. The proposed system has been extensively tested on various publicly available datasets. We demonstrate its performance for prostate cancer diagnosis and compare it with a well-established multiclass classification scheme. The results show that the proposed system could prove to be a valuable diagnostic aid in medicine.
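The stages named in the abstract (missing value estimation, normalization, gene selection, multiclass classification) can be illustrated with a minimal sketch. The abstract does not say which methods the system uses, so every choice below is an assumption made for brevity: per-gene mean imputation, per-gene z-score normalization, a simple between-class over within-class spread score for gene selection, and a nearest-centroid classifier. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def impute_missing(samples):
    """Replace None entries with the mean of that gene's observed values (assumed method)."""
    filled = [list(s) for s in samples]
    for g in range(len(samples[0])):
        observed = [s[g] for s in samples if s[g] is not None]
        gene_mean = mean(observed) if observed else 0.0
        for row in filled:
            if row[g] is None:
                row[g] = gene_mean
    return filled


def normalize(samples):
    """Z-score each gene across samples (assumed normalization)."""
    out = [list(s) for s in samples]
    for g in range(len(samples[0])):
        col = [s[g] for s in samples]
        mu, sd = mean(col), pstdev(col)
        for row in out:
            row[g] = (row[g] - mu) / sd if sd > 0 else 0.0
    return out


def select_genes(samples, labels, k):
    """Keep the k genes whose class means differ most relative to within-class spread."""
    classes = sorted(set(labels))
    scores = []
    for g in range(len(samples[0])):
        groups = [[s[g] for s, y in zip(samples, labels) if y == c] for c in classes]
        means = [mean(v) for v in groups]
        within = sum(pstdev(v) for v in groups) + 1e-9  # avoid division by zero
        scores.append((max(means) - min(means)) / within)
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return sorted(ranked[:k])


def fit_centroids(samples, labels, genes):
    """Compute one mean expression vector per class over the selected genes."""
    centroids = {}
    for c in set(labels):
        rows = [[s[g] for g in genes] for s, y in zip(samples, labels) if y == c]
        centroids[c] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(sample, centroids, genes):
    """Assign the class whose centroid is closest in Euclidean distance."""
    x = [sample[g] for g in genes]
    return min(centroids, key=lambda c: math.dist(x, centroids[c]))


# Toy data (hypothetical): 6 samples x 4 genes, 3 classes, None marks a missing value.
raw = [
    [1.0, 5.0, 0.2, None],
    [1.2, 5.1, 0.1, 3.0],
    [4.0, 5.0, 0.3, 3.1],
    [4.1, None, 0.2, 2.9],
    [8.0, 5.1, 0.1, 3.0],
    [7.9, 4.9, 0.3, 3.0],
]
labels = ["A", "A", "B", "B", "C", "C"]

data = normalize(impute_missing(raw))
genes = select_genes(data, labels, k=1)
centroids = fit_centroids(data, labels, genes)
predictions = [predict(s, centroids, genes) for s in data]
print(genes, predictions)
```

In this toy set only the first gene separates the three classes, so it is the one selected. A real evaluation would classify held-out samples rather than the training samples, and would likely use stronger imputation and classification methods than these stand-ins.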
Copyright information
© 2006 International Federation for Information Processing
About this paper
Cite this paper
Maroulis, D., Iakovidis, D., Flaounas, I., Karkanis, S. (2006). A gene expression analysis system for medical diagnosis. In: Maglogiannis, I., Karpouzis, K., Bramer, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2006. IFIP International Federation for Information Processing, vol 204. Springer, Boston, MA. https://doi.org/10.1007/0-387-34224-9_53
DOI: https://doi.org/10.1007/0-387-34224-9_53
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-34223-8
Online ISBN: 978-0-387-34224-5
eBook Packages: Computer Science, Computer Science (R0)