A Clustering-Based Method for Gene Selection to Classify Tissue Samples in Lung Cancer

  • José A. Castellanos-Garzón
  • Juan Ramos
  • Alfonso González-Briones
  • Juan F. de Paz
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 477)


This paper proposes a gene selection approach based on clustering DNA-microarray data. The aim is to find a subset of boundary genes drawn from the gene groupings that a clustering method imposes on the case study, gene expression data in lung cancer. We assume that this subset consists of informative genes that can be used to train a classifier on tumor tissue samples. To do so, we compare the results of several hierarchical clustering methods, select the best one, and then choose the most suitable clustering with the help of visualization techniques. The chosen clustering is then used to compute the boundary genes. The results obtained on the case study support the reliability of this approach.
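The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the function names `best_linkage` and `boundary_genes` are hypothetical, the cophenetic correlation coefficient stands in for whatever comparison criterion the paper uses, a fixed cluster count replaces the visualization-based choice of clustering, and a gene is treated as a "boundary" gene when it lies almost as close to another cluster's centroid as to its own (controlled by the hypothetical `ratio` parameter). Rows of `X` are genes and columns are samples; the data are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet, fcluster
from scipy.spatial.distance import pdist, cdist


def best_linkage(X, methods=("single", "complete", "average", "ward")):
    """Return the linkage method with the highest cophenetic correlation.

    Assumption: cophenetic correlation is used here only as a stand-in
    criterion for comparing hierarchical clusterings.
    """
    d = pdist(X, metric="euclidean")  # condensed gene-to-gene distances
    scores = {}
    for m in methods:
        Z = linkage(X, method=m, metric="euclidean")
        c, _ = cophenet(Z, d)  # correlation between tree and original distances
        scores[m] = float(c)
    best = max(scores, key=scores.get)
    return best, scores


def boundary_genes(X, labels, ratio=0.8):
    """Return row indices of genes lying near the border of their cluster.

    Assumption: a gene is a boundary gene when its distance to its own
    centroid is at least `ratio` times its distance to the nearest
    other centroid. This rule is illustrative, not the paper's definition.
    """
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.vstack([X[labels == k].mean(axis=0) for k in ids])
    D = cdist(X, centroids)                 # gene-to-centroid distances
    rows = np.arange(len(X))
    own_col = np.searchsorted(ids, labels)  # column of each gene's own centroid
    own = D[rows, own_col]
    D_other = D.copy()
    D_other[rows, own_col] = np.inf         # mask out the own centroid
    nearest_other = D_other.min(axis=1)
    return np.where(own >= ratio * nearest_other)[0]


# Synthetic expression matrix: 60 genes (rows) x 5 samples (columns).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 5)), rng.normal(4.0, 1.0, (30, 5))])

method, scores = best_linkage(X)
Z = linkage(X, method=method, metric="euclidean")
labels = fcluster(Z, t=2, criterion="maxclust")  # fixed k replaces visual selection
selected = boundary_genes(X, labels)
print(method, len(selected))
```

The indices in `selected` would then pick the gene features used to describe each tissue sample when training a classifier. Any classifier could be plugged in at that point, since the abstract does not name one.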


DNA-microarray, Feature selection, Data clustering, Genetic algorithm, Data mining, Visual analytics




Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • José A. Castellanos-Garzón (1), corresponding author
  • Juan Ramos (1)
  • Alfonso González-Briones (1)
  • Juan F. de Paz (1)
  1. Faculty of Science, Biomedical Research Institute of Salamanca/BISITE Research Group, Edificio I+D+i USAL, University of Salamanca, Salamanca, Spain
