Gene Selection and Classification Rule Generation for Microarray Dataset

  • Soumen Kumar Pati
  • Asit Kumar Das
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 178)


Microarray technology measures the expression levels of thousands of genes simultaneously. One of the challenges in classifying cancer from high-dimensional gene expression data is selecting a minimal set of relevant genes that maximizes classification accuracy. Because specific cancerous gene expression profiles have distinct characteristics, developing flexible and robust gene identification methods is essential. Many gene selection methods, along with their corresponding classifiers, have been proposed. In the method proposed here, a single gene with high class-discrimination capability is selected, and classification rules for cancer are generated from gene expression profiles.
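
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what single-gene selection with a threshold rule might look like, not the authors' algorithm. It assumes two classes labelled 0 and 1, uses a one-dimensional 2-means split (suggested by the K-means keyword) to find a cut point per gene, and scores each gene by training accuracy. The names `two_means_cut` and `best_single_gene` and the toy data are hypothetical.

```python
from typing import List, Tuple


def two_means_cut(values: List[float], iters: int = 50) -> float:
    """1-D K-means with k=2; returns the midpoint between the two centroids."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        # Assign each value to the nearer centroid (ties go to the low group).
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values identical, nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # converged
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def best_single_gene(
    expr: List[List[float]], labels: List[int]
) -> Tuple[int, float, int, float]:
    """expr[g][s] is the expression of gene g in sample s; labels[s] is 0 or 1.

    Returns (gene index, cut point, label predicted above the cut,
    training accuracy) for the gene whose threshold rule fits best.
    """
    best = (-1, 0.0, 1, 0.0)
    for g, row in enumerate(expr):
        cut = two_means_cut(row)
        # Rule candidate: expression > cut  =>  class 1, else class 0.
        hits = sum(1 for v, y in zip(row, labels) if (1 if v > cut else 0) == y)
        acc = hits / len(labels)
        high_label = 1
        if acc < 0.5:  # the inverted rule fits better
            acc, high_label = 1.0 - acc, 0
        if acc > best[3]:
            best = (g, cut, high_label, acc)
    return best


# Toy data (made up): 3 genes x 4 samples, two samples per class.
toy_expr = [
    [1.0, 5.0, 1.2, 5.1],  # splits cleanly, but not along the class labels
    [0.1, 0.2, 5.0, 5.2],  # separates the two classes
    [3.0, 3.1, 2.9, 3.0],  # nearly constant
]
toy_labels = [0, 0, 1, 1]
gene, cut, high_label, acc = best_single_gene(toy_expr, toy_labels)
print(f"IF gene[{gene}] > {cut:.3f} THEN class {high_label} ELSE class {1 - high_label}")
```

Accuracy here is measured on the same samples used to choose the cut, so it is optimistic; a real evaluation would hold out test samples or use cross-validation.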


Keywords: Microarray cancer data; K-means algorithm; Gene selection; Classification rule; Cancer sample identification





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Science/Information Technology, St. Thomas' College of Engineering and Technology, Kolkata, India
  2. Department of Computer Science and Technology, Bengal Engineering and Science University, Howrah, India
