Microarray data are expected to be useful for cancer classification. However, gene selection for classification is difficult because of properties of the data: the number of samples is small compared with the huge number of genes (high-dimensional data), many genes are irrelevant, and the data are noisy. Hence, this article aims to select a near-optimal (small) subset of informative genes that is most relevant for cancer classification. To achieve this aim, an iterative approach based on genetic algorithms is proposed. Experimental results show that the proposed approach outperforms previous related work as well as four other methods tried in this work. In addition, a list of the informative genes in the best gene subsets is presented for biological use.
Keywords: Gene selection, Genetic algorithm, Iterative approach, Microarray data