Genetic Algorithm and Neural Network Based Classification in Microarray Data Analysis with Biological Validity Assessment

  • Vitoantonio Bevilacqua
  • Giuseppe Mastronardi
  • Filippo Menolascina
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4115)


Microarrays allow biologists to better understand the interactions between diverse pathologic states at the gene level. However, the amount of data generated by these tools is problematic, and new techniques are needed to extract valuable information about gene activity in sensitive processes such as tumor cell proliferation and metastasis. Recent tools for analyzing microarray expression data have exploited correlation-based approaches such as clustering analysis. Here we describe a novel GA/ANN (genetic algorithm/artificial neural network) based method for assessing the importance of genes for sample classification based on expression data. Several different approaches have been explored and a comparison is given. The developed system has been employed in the classification of ER+/- metastasis recurrence of breast cancer tumours, and results were validated using a real-life database. Further validation has been carried out using Gene Ontology based tools. The results indicate the potential and robustness of such systems.
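The general idea of pairing a genetic algorithm with a classifier for gene selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic data, the single-neuron perceptron standing in for the ANN, the subset-size penalty, the selection-frequency count used as a crude importance proxy, and every function name and parameter value below are hypothetical choices made only for the example.

```python
import random

def make_toy_data(n_samples=40, n_genes=20, informative=(0, 1, 2), seed=0):
    """Synthetic expression matrix: only `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 if label else -3.0
        X.append(row)
        y.append(label)
    return X, y

def predict(model, row, genes):
    w, b = model
    return 1 if b + sum(wi * row[g] for wi, g in zip(w, genes)) > 0 else 0

def train_perceptron(X, y, genes, epochs=20, lr=0.1):
    """Single-neuron perceptron (a stand-in for a full ANN) on selected genes."""
    w, b = [0.0] * len(genes), 0.0
    for _ in range(epochs):
        for row, label in zip(X, y):
            err = label - predict((w, b), row, genes)
            if err:
                b += lr * err
                w = [wi + lr * err * row[g] for wi, g in zip(w, genes)]
    return w, b

def fitness(mask, train, test):
    """Held-out accuracy minus a small (assumed) penalty per selected gene."""
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    model = train_perceptron(train[0], train[1], genes)
    hits = sum(predict(model, r, genes) == t for r, t in zip(test[0], test[1]))
    return hits / len(test[1]) - 0.01 * len(genes)

def run_ga(train, test, n_genes, pop_size=30, generations=25, p_mut=0.05, seed=1):
    """Evolve boolean gene masks; also count how often each gene survives."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.2 for _ in range(n_genes)] for _ in range(pop_size)]
    counts = [0] * n_genes
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, train, test), reverse=True)
        parents = scored[: pop_size // 2]          # truncation selection
        for mask in parents:
            for g, bit in enumerate(mask):
                counts[g] += bit
        children = [scored[0][:]]                  # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)        # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([(not bit) if rng.random() < p_mut else bit
                             for bit in child])
        pop = children
    best = max(pop, key=lambda m: fitness(m, train, test))
    return best, counts

if __name__ == "__main__":
    X, y = make_toy_data()
    train, test = (X[:20], y[:20]), (X[20:], y[20:])
    best, counts = run_ga(train, test, n_genes=20)
    print("selected genes:", [g for g, bit in enumerate(best) if bit])
    print("selection counts:", counts)
```

In this sketch the classifier is retrained for every candidate mask, so the fitness reflects how well a gene subset supports classification rather than how strongly individual genes correlate with the label. A real analysis would need proper cross-validation and a held-out set that is never used for selection.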


Keywords: Genetic Algorithm; Gene Ontology; Artificial Immune System; Microarray Data Analysis; Soft Computing Technique





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Vitoantonio Bevilacqua (1)
  • Giuseppe Mastronardi (1)
  • Filippo Menolascina (1)
  1. Dipartimento di Elettrotecnica ed Elettronica, Polytechnic of Bari, Bari, Italy
