An Excellent Feature Selection Model Using Gradient-Based and Point Injection Techniques

  • D. Huang
  • Tommy W. S. Chow
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4233)


This paper enhances the effectiveness of filter feature selection models in two ways. The first is to modify the feature search engine on the basis of optimization theory; the second is to improve regularization capability using point injection techniques. The second is particularly important where overfitting is likely, for example when only a small sample set is available. Synthetic and real data are used to demonstrate the contribution of our proposed strategies.
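The abstract does not say how the injected points are generated. As a rough illustration only, the sketch below assumes the simplest form of point injection: each training sample is surrounded by a few synthetic copies perturbed with Gaussian noise, and each copy inherits the label of its source sample. The function name `inject_points` and the parameters `copies`, `scale`, and `seed` are hypothetical and are not taken from the paper.

```python
import random


def inject_points(samples, labels, copies=3, scale=0.1, seed=0):
    """Return the original samples plus `copies` jittered points per sample.

    Assumption (not from the paper): an injected point is the source sample
    with independent Gaussian noise of standard deviation `scale` added to
    every feature, and it keeps the label of the source sample.
    """
    rng = random.Random(seed)
    out_x = [list(x) for x in samples]  # originals come first, unchanged
    out_y = list(labels)
    for x, y in zip(samples, labels):
        for _ in range(copies):
            out_x.append([v + rng.gauss(0.0, scale) for v in x])
            out_y.append(y)
    return out_x, out_y


# Tiny two-feature data set with three samples.
X = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.2]]
y = [0, 1, 1]
X_aug, y_aug = inject_points(X, y, copies=3, scale=0.1)
print(len(X_aug), len(y_aug))  # 12 12
```

A feature-relevance criterion estimated on the augmented set rather than on the original set would see a smoother sample distribution, which is the general sense in which noise injection acts as a regularizer when samples are scarce.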


Feature Selection, Feature Subset, Point Injection, Classification Learning, Filter Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • D. Huang (1)
  • Tommy W. S. Chow (1)
  1. Department of Electrical Engineering, City University of Hong Kong
