A Scalable Feature Selection Method to Improve the Analysis of Microarrays
DNA microarray experiments collect gene expression information from tissue and cell samples; differences in expression are useful for the diagnosis and treatment of many diseases. Predictive accuracy on these datasets is hindered by their high dimensionality and by the presence of irrelevant and redundant features. Feature selection can therefore improve classification accuracy in this demanding research field.
However, standard feature selection methods may perform very poorly on high-dimensional microarray data. We propose a scalable evolutionary method to select relevant genes. We use a divide-and-conquer approach to deal with the scalability issues of evolutionary algorithms, and we combine several rounds of feature selection to improve both accuracy and storage reduction. Our proposal improves on the results of standard classifiers and feature selection methods, in both accuracy and storage reduction, on 8 different microarray datasets.
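The divide-and-conquer idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each round splits the features into random disjoint blocks, that a selector runs independently on each block, and that the rounds are combined by counting votes. The per-block selector here is a simple difference-of-class-means ranking standing in for the evolutionary search. The names `block_selector` and `divide_and_vote` and the parameters `block_size`, `keep`, `rounds`, and `min_votes` are hypothetical.

```python
import random
from statistics import mean


def block_selector(X, y, block, keep):
    """Stand-in per-block selector (NOT the paper's evolutionary search):
    rank the features in `block` by the absolute difference of their
    class means and return the `keep` best ones. Assumes binary labels 0/1."""
    def score(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(mean(pos) - mean(neg))
    return sorted(block, key=score, reverse=True)[:keep]


def divide_and_vote(X, y, block_size=5, keep=1, rounds=10, min_votes=6, seed=0):
    """Run several rounds of selection over random disjoint feature blocks
    and keep the features chosen in at least `min_votes` rounds."""
    rng = random.Random(seed)
    n_features = len(X[0])
    votes = [0] * n_features
    for _ in range(rounds):
        order = list(range(n_features))
        rng.shuffle(order)  # a fresh random partition for each round
        for start in range(0, n_features, block_size):
            block = order[start:start + block_size]
            for j in block_selector(X, y, block, keep):
                votes[j] += 1
    return [j for j in range(n_features) if votes[j] >= min_votes]
```

Each block is small, so the cost of the per-block selector depends on `block_size` rather than on the total number of genes. Repeating the partition over several rounds lets a feature be compared against different neighbours before the votes are combined.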
Keywords: Genetic Algorithm; Feature Selection; Feature Selection Method; Microarray Dataset; Feature Selection Algorithm