Microarray Data Analysis Using Neural Network Classifiers and Gene Selection Methods

Zheng, Gaolin; Olusegun George, E.; Narasimhan, Giri

doi:10.1007/0-387-23077-7_16



Abstract:

Different research groups have conducted independent gene expression studies on tissue samples from human lung adenocarcinomas [Bhattacharjee et al. 2001; Beer et al. 2002]. In this paper we (a) investigate methods to integrate data obtained from independent studies, (b) experiment with different gene selection methods to find genes whose expression differs significantly among tumor stages, (c) study the performance of neural network classifiers with correlated weights, and (d) compare the performance of classifiers based on neural networks and their variants on gene expression data. Raw cell intensity data were preprocessed for our analyses. Affymetrix array comparison spreadsheets were used to extract the overlapping probe sets for the data integration study. We considered neural network classifiers with random weights selected from a univariate normal distribution and optimized using Bayesian methods. The performance of the neural network was further enhanced using ensemble techniques such as bagging and boosting. The performance of all the resulting classifiers was compared using the Michigan and Harvard data sets from the CAMDA website. Three gene selection methods were used to find significant genes that could discriminate between the various stages of lung cancer. Significant genes, which were mined from the Gene Ontology (GO) database using the GoMiner and AmiGO packages, were found to be involved in apoptosis, angiogenesis, and cell growth and differentiation. Neural networks enhanced with bagging exhibited the best performance among all the classifiers we tested.
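
As an illustration of the bagging idea mentioned in the abstract, the following is a minimal sketch and not the chapter's implementation. It assumes a toy two-class data set generated at random (standing in for expression profiles), and it swaps in a single-layer perceptron as the base learner in place of the chapter's Bayesian-optimized neural networks. All function names (`train_perceptron`, `bagging_fit`, `bagging_predict`, `make_toy_data`) are hypothetical.

```python
import random

def train_perceptron(X, y, epochs=20, lr=0.1, seed=0):
    """Train a single-layer perceptron; labels in y are 0 or 1."""
    rng = random.Random(seed)
    # Initial weights drawn from a univariate normal distribution (illustrative only).
    w = [rng.gauss(0.0, 0.1) for _ in range(len(X[0]))]
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred
            if err:  # update only on misclassified samples
                w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
                b += lr * err
    return w, b

def predict_one(model, x):
    """Predict the class (0 or 1) of one sample with one trained perceptron."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

def bagging_fit(X, y, n_models=11, seed=0):
    """Train n_models base learners, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for m in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(
            train_perceptron([X[i] for i in idx], [y[i] for i in idx], seed=seed + m)
        )
    return models

def bagging_predict(models, x):
    """Aggregate the base learners by majority vote."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0

def make_toy_data(n_per_class=30, n_genes=5, seed=1):
    """Hypothetical data: two classes whose 'genes' differ in mean expression."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0) for _ in range(n_genes)])
            y.append(label)
    return X, y

X, y = make_toy_data()
models = bagging_fit(X, y)
accuracy = sum(bagging_predict(models, xi) == yi for xi, yi in zip(X, y)) / len(y)
print(f"training accuracy of the bagged ensemble: {accuracy:.2f}")
```

An odd number of base learners is used so that the majority vote cannot tie. The accuracy printed here is measured on the training data and is only a sanity check; a real comparison of classifiers would need held-out samples or cross-validation.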


REFERENCES

Ando, T., M. Suguro, T. Hanai, T. Kobayashi, H. Honda and M. Seto (2002). “Fuzzy neural network applied to gene expression profiling for predicting the prognosis of diffuse large B-cell lymphoma.” Japanese Journal of Cancer Research 93(11): 1207–12.
Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin and G. Sherlock (2000). “Gene Ontology: tool for the unification of biology.” Nature Genetics 25: 25–29.
Beer, D. G., S. L. R. Kardia, C.-C. Huang, T. J. Giordano, A. M. Levin, D. E. Misek, L. Lin, G. Chen, T. G. Gharib, D. G. Thomas, M. L. Lizyness, R. Kuick, S. H. Hayasaka, J. M. G. Taylor, M. D. Iannettoni, M. B. Orringer and S. Hanash (2002). “Gene-expression profiles predict survival of patients with lung adenocarcinoma.” Nature Medicine 8(8): 816–24.
Bhattacharjee, A., W. G. Richards, J. Staunton, C. Li, S. Monti, P. Vasa, C. Ladd, J. Beheshti, R. Bueno, M. Gillette, M. Loda, G. Weber, E. J. Mark, E. S. Lander, W. Wong, B. E. Johnson, T. R. Golub, D. J. Sugarbaker and M. Meyerson (2001). “Expression profiling reveals distinct adenocarcinoma subclasses.” PNAS 98(24): 13790–13795.
Breiman, L. (1996). “Bagging predictors.” Machine Learning J. 24(2): 123–40.
Grey, S., S. Dlay, B. Leone, F. Cajone and G. Sherbet (2003). “Prediction of nodal spread of breast cancer by using artificial neural network-based analyses of S100A4, nm23 and steroid receptor expression.” Clin Exp Metastasis 20(6): 507–14.
Irizarry, R., B. Hobbs, F. Collin, Y. Beazer-Barclay, K. Antonellis, U. Scherf and T. Speed (2003). “Exploration, normalization, and summaries of high density oligonucleotide array probe level data.” Biostatistics 4(2): 249–264.
Japkowicz, N. (2000). Class imbalance problem: significance and strategies. International Conference on Artificial Intelligence (IC-AI’2000): Special Track on Inductive Learning, Las Vegas.
Khan, J., J. S. Wei, M. Ringner, L. H. Saal, M. Ladanyi, F. Westermann, F. Berthold, M. Schwab, C. R. Antonescu, C. Peterson and P. S. Meltzer (2001). “Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks.” Nat Med 7(6): 673–9.
Li, C. and W. H. Wong (2001). “Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection.” PNAS 98(1): 31–36.
Mateos, A., J. Herrero, J. Tamames and J. Dopazo (2002). Supervised Neural Networks for Clustering Conditions in DNA Array Data after Reducing Noise by Clustering Gene Expression Profiles. Methods of Microarray Data Analysis II. S. M. Lin and K. F. Johnson. Boston, Kluwer Academic Publishers.
Schapire, R. E. (1990). “The strength of weak learnability.” Machine Learning J. 5(2): 197–227.
Singhal, S., C. G. Kyvernitis, S. W. Johnson, L. R. Kaiser, M. N. Liebman and S. M. Albelda (2003). “MicroArray Data Simulator For Improved Selection of Differentially Expressed Genes.” Cancer Biology & Therapy 2(4): 383–391.
Tusher, V. G., R. Tibshirani and G. Chu (2001). “Significance analysis of microarrays applied to the ionizing radiation response.” PNAS 98(9): 5116–5121.
Zeeberg, B. R., W. Feng, G. Wang, M. D. Wang, A. T. Fojo, M. Sunshine, S. Narasimhan, D. W. Kane, W. C. Reinhold, S. Lababidi, K. J. Bussey, J. Riss, J. C. Barrett and J. N. Weinstein (2003). “GoMiner: A Resource for Biological Interpretation of Genomic and Proteomic Data.” Genome Biology 4(4): R28.

Author information

Authors and Affiliations

School of Computer Science, Florida International University, Miami, FL, 33199
Gaolin Zheng
Mathematical Sciences Department, University of Memphis, Memphis, TN, 38152
E. Olusegun George


Editor information

Editors and Affiliations

Duke Bioinformatics Shared Resource, Duke University Medical Center, Durham, NC, USA
Jennifer S. Shoemaker & Simon M. Lin


About this chapter

Cite this chapter

Zheng, G., Olusegun George, E., Narasimhan, G. (2005). Microarray Data Analysis Using Neural Network Classifiers and Gene Selection Methods. In: Shoemaker, J.S., Lin, S.M. (eds) Methods of Microarray Data Analysis. Springer, Boston, MA. https://doi.org/10.1007/0-387-23077-7_16


DOI: https://doi.org/10.1007/0-387-23077-7_16
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23074-0
Online ISBN: 978-0-387-23077-1
eBook Packages: Biomedical and Life Sciences (R0)
