Abstract
In this paper we compare various applications of supervised and unsupervised neural networks to the analysis of the gene expression profiles produced using DNA microarrays. In particular we are interested in the classification of samples or conditions. We have found that if gene expression profiles are clustered at the optimal level, the classification of conditions obtained using the average gene expression profile of each cluster is better than that obtained directly using all the gene expression profiles. If a supervised method (a back propagation neural network) is used instead of an unsupervised method, the efficiency of the classification of conditions increases. We studied the relative efficiencies of different clustering methods for reducing the dimensionality of the gene expression profile data set and found that the Self-Organising Tree Algorithm (SOTA) is a good choice for this task.
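The dimensionality-reduction step described above, where each cluster of genes is replaced by its average expression profile before conditions are classified, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names `cluster_average_profiles` and `condition_features`, the toy data, and the cluster labels are all hypothetical. The labels are assumed to come from some prior clustering step (for example SOTA), which is not shown here.

```python
from collections import defaultdict


def cluster_average_profiles(profiles, labels):
    """Average gene expression profiles within each gene cluster.

    profiles: list of per-gene lists, one value per condition.
    labels:   cluster id for each gene (assumed to be supplied by a
              separate clustering step; hypothetical input).
    Returns a dict mapping cluster id -> average profile of its genes.
    """
    if len(profiles) != len(labels):
        raise ValueError("one label per gene profile is required")
    groups = defaultdict(list)
    for profile, label in zip(profiles, labels):
        groups[label].append(profile)
    # zip(*members) iterates over conditions (columns) within a cluster.
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def condition_features(averages):
    """Transpose cluster averages: one feature per cluster for each condition."""
    ordered = [averages[key] for key in sorted(averages)]
    return [list(col) for col in zip(*ordered)]


# Toy data (hypothetical): 4 genes measured under 3 conditions.
profiles = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [0.0, 0.0, 6.0],
    [2.0, 2.0, 0.0],
]
labels = [0, 0, 1, 1]  # assumed cluster assignment of the 4 genes

averages = cluster_average_profiles(profiles, labels)
features = condition_features(averages)
```

Each row of `features` then describes one condition by as many values as there are clusters rather than one value per gene, and such rows could serve as input to a classifier of conditions, whether supervised or unsupervised.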
Copyright information
© 2002 Kluwer Academic Publishers
About this chapter
Cite this chapter
Mateos, A., Herrero, J., Tamames, J., Dopazo, J. (2002). Supervised Neural Networks for Clustering Conditions in DNA Array Data After Reducing Noise by Clustering Gene Expression Profiles. In: Lin, S.M., Johnson, K.F. (eds) Methods of Microarray Data Analysis II. Springer, Boston, MA. https://doi.org/10.1007/0-306-47598-7_7
DOI: https://doi.org/10.1007/0-306-47598-7_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-7111-9
Online ISBN: 978-0-306-47598-6
eBook Packages: Springer Book Archive