Abstract
Cancer diagnosis is one of the major emerging clinical applications of microarray gene expression data. However, cancer classification on such data remains a difficult problem, mainly because the number of genes is very large compared to the number of available training samples. In this paper, we propose a hybrid feature selection approach that combines the correlation coefficient with particle swarm optimization. Feature selection and classification are performed on three multi-class datasets: Lymphoma, MLL and SRBCT. After feature selection, the selected genes are classified with an Extreme Learning Machine classifier. Experimental results show that the proposed hybrid approach reduces the number of effective levels of gene expression, and that it obtains higher classification accuracy with fewer features than the same experiment performed using traditional tree-based classifiers such as J48, random forest, random tree and decision stump, as well as a genetic algorithm.
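The pipeline outlined in the abstract (correlation-based filtering, a particle swarm search over the surviving genes, then an Extreme Learning Machine) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the correlation coefficient is applied to multi-class labels, which PSO variant is used, or how the ELM is configured. The sketch assumes a one-vs-rest Pearson score, a sigmoid-transfer binary PSO with illustrative coefficients (0.7, 1.5, 1.5), a tanh ELM with 20 hidden units, and a fitness that lightly penalises larger subsets. All function names are hypothetical.

```python
import numpy as np

def correlation_filter(X, y, k):
    """Keep the k genes with the largest |Pearson r| against any
    one-vs-rest class indicator (an assumed multi-class scoring rule)."""
    Y = np.eye(int(y.max()) + 1)[y]              # one-hot labels (n, classes)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    cov = Xc.T @ Yc                              # (genes, classes)
    norm = np.outer(np.linalg.norm(Xc, axis=0),
                    np.linalg.norm(Yc, axis=0)) + 1e-12
    score = np.abs(cov / norm).max(axis=1)
    return np.argsort(score)[::-1][:k]

def elm_predict(X_tr, y_tr, X_te, n_hidden=20, seed=0):
    """Basic ELM: random fixed hidden layer, output weights by pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X_tr.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X_tr @ W + b)
    T = np.eye(int(y_tr.max()) + 1)[y_tr]        # one-hot targets
    beta = np.linalg.pinv(H) @ T
    return np.argmax(np.tanh(X_te @ W + b) @ beta, axis=1)

def fitness(bits, X_tr, y_tr, X_va, y_va):
    """Validation accuracy minus a small penalty on subset size (assumed form)."""
    mask = bits.astype(bool)
    if not mask.any():
        return 0.0
    acc = np.mean(elm_predict(X_tr[:, mask], y_tr, X_va[:, mask]) == y_va)
    return float(acc - 0.01 * mask.mean())

def binary_pso(X_tr, y_tr, X_va, y_va, n_particles=10, n_iter=15, seed=0):
    """Binary PSO: velocities are real-valued, positions are 0/1 gene masks
    resampled through a sigmoid of the velocity."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(float)
    vel = np.zeros((n_particles, d))

    def score(P):
        return np.array([fitness(p, X_tr, y_tr, X_va, y_va) for p in P])

    pbest, pbest_fit = pos.copy(), score(pos)
    best = pbest_fit.argmax()
    gbest, gbest_fit = pbest[best].copy(), pbest_fit[best]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random((n_particles, d)) < prob).astype(float)
        fit = score(pos)
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        best = pbest_fit.argmax()
        if pbest_fit[best] > gbest_fit:
            gbest, gbest_fit = pbest[best].copy(), pbest_fit[best]
    return gbest.astype(bool), float(gbest_fit)

def hybrid_select(X_tr, y_tr, X_va, y_va, k=20, seed=0):
    """Filter stage followed by wrapper stage; returns original gene indices."""
    keep = correlation_filter(X_tr, y_tr, k)
    mask, fit = binary_pso(X_tr[:, keep], y_tr, X_va[:, keep], y_va, seed=seed)
    return keep[mask], fit
```

Calling `hybrid_select(X_train, y_train, X_val, y_val)` with integer class labels starting at 0 returns the indices of the retained genes and the fitness of the best subset found. Running the filter first keeps the swarm's search space small, which matters when genes outnumber samples by orders of magnitude.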
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Chinnaswamy, A., Srinivasan, R. (2016). Hybrid Feature Selection Using Correlation Coefficient and Particle Swarm Optimization on Microarray Gene Expression Data. In: Snášel, V., Abraham, A., Krömer, P., Pant, M., Muda, A. (eds) Innovations in Bio-Inspired Computing and Applications. Advances in Intelligent Systems and Computing, vol 424. Springer, Cham. https://doi.org/10.1007/978-3-319-28031-8_20
DOI: https://doi.org/10.1007/978-3-319-28031-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28030-1
Online ISBN: 978-3-319-28031-8
eBook Packages: Engineering, Engineering (R0)