Feature Selection and Ranking of Key Genes for Tumor Classification: Using Microarray Gene Expression Data
In this paper we perform a t-test for significant gene expression analysis in different dimensions, based on molecular profiles from microarray data, and compare several computational intelligence techniques by classification accuracy on the Leukemia, Lymphoma and Prostate cancer datasets from the Broad Institute and the Colon cancer dataset from the Princeton Gene Expression Project. Classification accuracy is evaluated with Linear Genetic Programs, Multivariate Adaptive Regression Splines (MARS), Classification and Regression Trees (CART) and Random Forests. Linear Genetic Programs and Random Forests perform best at detecting the malignancy of different tumors. Our results demonstrate the potential of learning machines for diagnosing the malignancy of a tumor.
We also address the related issue of ranking the importance of input features, which is itself a problem of great interest. Eliminating insignificant inputs (genes) simplifies the problem and may yield faster and more accurate classification of microarray gene expression data. Experiments on selected cancer datasets have been carried out to assess the effectiveness of this ranking criterion. Results show that using the significant features gives the best performance and performs consistently well across the microarray gene expression datasets we used. The classifiers perform best with the most significant features, except on the Prostate cancer dataset.
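To illustrate the kind of t-test-based gene ranking described above, the following is a minimal sketch and not the authors' implementation. It assumes Welch's unequal-variance form of the t-statistic (the abstract does not state which variant is used), a two-class problem, and synthetic expression values in place of the actual datasets. The function names `welch_t` and `rank_genes` are hypothetical.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def rank_genes(expr, labels):
    """Return gene indices sorted by decreasing |t| between class 1 and class 0.

    expr   -- list of samples, each a list of expression values (one per gene)
    labels -- list of 0/1 class labels, one per sample
    """
    n_genes = len(expr[0])
    scores = []
    for g in range(n_genes):
        pos = [row[g] for row, y in zip(expr, labels) if y == 1]
        neg = [row[g] for row, y in zip(expr, labels) if y == 0]
        scores.append(abs(welch_t(pos, neg)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


# Synthetic data (illustrative assumption): 20 samples, 50 genes,
# with genes 0-2 shifted upward in class 1 and all other genes pure noise.
rng = random.Random(0)
labels = [0] * 10 + [1] * 10
expr = [
    [rng.gauss(5.0 if (y == 1 and g < 3) else 0.0, 1.0) for g in range(50)]
    for y in labels
]

ranking = rank_genes(expr, labels)
top_genes = ranking[:10]  # candidate reduced feature set for a downstream classifier
```

The reduced feature set in `top_genes` would then be passed to a classifier such as a Random Forest. Varying the cutoff (10 here, chosen arbitrarily) is one way to study how accuracy changes with the number of retained genes.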
Keywords: Feature Selection; Classification Accuracy; Random Forest; Multivariate Adaptive Regression Spline; Tumor Classification