Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003, pp 35–42
Selecting Salient Features for Classification Committees
Abstract
We present a neural network-based approach for identifying salient features for classification in neural network committees. Our approach trains each neural network with an augmented cross-entropy error function, which forces the network to keep the derivatives of its neurons' transfer functions low while learning the classification task. Feature selection is based on two criteria: the change in classification error on a cross-validation set when individual features are removed, and the diversity of the neural networks comprising the committee. The algorithm removed a large number of features from the original data sets without reducing the classification accuracy of the committees; in fact, committees using the reduced feature sets were more accurate than those using all of the original features.
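The full text is not reproduced here, but the two ingredients named in the abstract lend themselves to a compact illustration. The Python sketch below is a minimal interpretation, not the authors' implementation: `TinyMLP` adds a penalty on the hidden sigmoid derivatives to the cross-entropy error, `saliency_by_removal` scores each feature by the change in committee validation error when that feature is neutralised (mean substitution is one common removal surrogate), and `committee_diversity` uses mean pairwise disagreement as a stand-in diversity measure. All names, hyper-parameters, and the majority-voting combination rule are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyMLP:
    """One-hidden-layer sigmoid network trained by batch gradient descent on
    cross-entropy augmented with a penalty alpha * sum s'(z) on the hidden
    transfer-function derivatives (alpha is an illustrative assumption)."""
    def __init__(self, n_in, n_hidden, lr=0.1, alpha=0.01, epochs=300, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, n_hidden)
        self.b2 = 0.0
        self.lr, self.alpha, self.epochs = lr, alpha, epochs

    def forward(self, X):
        h = sigmoid(X @ self.W1 + self.b1)   # hidden activations
        y = sigmoid(h @ self.W2 + self.b2)   # class-1 probability
        return h, y

    def fit(self, X, t):
        n = len(t)
        for _ in range(self.epochs):
            h, y = self.forward(X)
            d_out = y - t                    # cross-entropy gradient at output
            dh = h * (1.0 - h)               # s'(z) of each hidden unit
            # gradient of the derivative penalty: d/dz s'(z) = s'(z)(1 - 2 s(z))
            pen = self.alpha * dh * (1.0 - 2.0 * h)
            d_hidden = np.outer(d_out, self.W2) * dh + pen
            self.W2 -= self.lr * (h.T @ d_out) / n
            self.b2 -= self.lr * d_out.mean()
            self.W1 -= self.lr * (X.T @ d_hidden) / n
            self.b1 -= self.lr * d_hidden.mean(axis=0)

    def predict(self, X):
        return (self.forward(X)[1] > 0.5).astype(int)

def saliency_by_removal(nets, X_val, t_val):
    """Change in majority-vote committee error on the validation set when each
    feature is neutralised by mean substitution; near-zero or negative scores
    mark candidates for removal."""
    def committee_error(X):
        votes = np.mean([net.predict(X) for net in nets], axis=0)
        return np.mean((votes > 0.5).astype(int) != t_val)
    base = committee_error(X_val)
    scores = []
    for j in range(X_val.shape[1]):
        X_mod = X_val.copy()
        X_mod[:, j] = X_val[:, j].mean()
        scores.append(committee_error(X_mod) - base)
    return np.array(scores)

def committee_diversity(nets, X):
    """Mean pairwise disagreement between member predictions (one simple
    diversity measure; the paper's exact definition may differ)."""
    preds = [net.predict(X) for net in nets]
    pairs = [(a, b) for i, a in enumerate(preds) for b in preds[i + 1:]]
    return float(np.mean([np.mean(p != q) for p, q in pairs]))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(400, 6))
    t = (X[:, 0] + X[:, 1] > 0).astype(float)    # only features 0 and 1 matter
    Xtr, ttr, Xva, tva = X[:300], t[:300], X[300:], t[300:]
    nets = []
    for seed in range(3):                        # small bagged committee
        idx = rng.integers(0, 300, 300)          # bootstrap resample
        net = TinyMLP(n_in=6, n_hidden=5, seed=seed)
        net.fit(Xtr[idx], ttr[idx])
        nets.append(net)
    print("saliency: ", np.round(saliency_by_removal(nets, Xva, tva), 3))
    print("diversity:", round(committee_diversity(nets, Xva), 3))
```

On data like the demo above, the saliency scores for the two informative features should be clearly positive (removing them raises the committee error), while the noise features score near zero and would be pruned first.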