An Effective Feature Selection Algorithm Based on the Class Similarity Used with a SVM-RDA Classifier to Protein Fold Recognition
Feature selection is an important step in many pattern recognition problems. It is effective in reducing dimensionality, removing irrelevant data, and increasing the accuracy of a classifier. In our previous work we proposed a classifier that combines a support vector machine (SVM) classifier with a regularized discriminant analysis (RDA) classifier and applied it to the protein fold recognition problem. However, the high dimensionality of the feature vectors and the small number of samples in the training data set make the problem ill-posed for an RDA classifier, so feature selection is crucial for the accuracy of the classifier. In this paper we propose a simple and effective algorithm based on class similarity that solves this problem and helps us achieve very good accuracy on a real-world data set.
Keywords: Feature selection, support vector machine, statistical classifiers, RDA classifier, protein fold recognition
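The ill-posedness mentioned in the abstract can be illustrated with a small sketch. It does not implement the paper's feature selection algorithm, which the abstract does not specify. It only shows why a class covariance matrix estimated from fewer samples than features is singular, and how shrinking it toward a scaled identity matrix makes it invertible. That shrinkage is one regularization step commonly associated with RDA; whether the SVM-RDA classifier uses exactly this form is an assumption. The toy data and the names `covariance`, `determinant`, `shrink`, and `gamma` are hypothetical.

```python
from fractions import Fraction


def covariance(samples):
    """Unbiased sample covariance matrix of equal-length rows (exact arithmetic)."""
    n, p = len(samples), len(samples[0])
    mean = [sum(Fraction(row[j]) for row in samples) / n for j in range(p)]
    centered = [[Fraction(row[j]) - mean[j] for j in range(p)] for row in samples]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def shrink(cov, gamma):
    """Return (1 - gamma) * cov + gamma * (trace(cov) / p) * I."""
    p = len(cov)
    scale = sum(cov[i][i] for i in range(p)) / p
    return [[(1 - gamma) * cov[i][j] + (gamma * scale if i == j else 0)
             for j in range(p)] for i in range(p)]


# Hypothetical toy class: 3 training samples, 4 features (fewer samples than features).
toy_class = [[1, 2, 0, 3], [2, 0, 1, 1], [0, 1, 4, 2]]

cov = covariance(toy_class)
regularized = shrink(cov, Fraction(1, 10))  # gamma = 0.1 is an arbitrary choice

print("det of raw covariance:", determinant(cov))
print("regularized covariance is invertible:", determinant(regularized) > 0)
```

With three samples in four dimensions the centered data span at most two directions, so the raw covariance cannot be inverted and the discriminant function is undefined. Shrinkage works around this at the cost of bias, while removing features attacks the same problem by lowering the dimension itself.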