Feature Selection by Markov Chain Monte Carlo Sampling – A Bayesian Approach

  • Michael Egmont-Petersen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3138)

Abstract

We recast feature selection as a model selection problem and propose a Markov chain Monte Carlo (MCMC) method for sampling models. We relate the applicability of the method to Bayesian network classifiers. Simulation experiments indicate that our novel proposal distribution results in an ignorant proposal prior. Finally, we show how the sampling can be controlled by a regularization prior.
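To make the idea of sampling models concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a plain Metropolis-Hastings sampler over feature subsets with a symmetric single-flip proposal; the abstract does not describe the paper's own proposal distribution, so this stands in for it. The names `log_prior`, `mh_feature_sampler`, `toy_log_score` and the `penalty` parameter are hypothetical, and the regularization prior is assumed to be a simple penalty on the number of selected features.

```python
import math
import random


def log_prior(mask, penalty):
    """Assumed regularization prior: log p(model) = -penalty * (number of selected features)."""
    return -penalty * sum(mask)


def mh_feature_sampler(log_score, n_features, n_steps, penalty=1.0, seed=0):
    """Metropolis-Hastings over feature subsets (illustrative sketch).

    log_score(mask) returns a log model score (for example a log marginal
    likelihood) for the subset encoded by the 0/1 tuple `mask`.
    Returns the fraction of steps in which each feature was included.
    """
    rng = random.Random(seed)
    mask = tuple(rng.randint(0, 1) for _ in range(n_features))
    current = log_score(mask) + log_prior(mask, penalty)
    counts = [0] * n_features
    for _ in range(n_steps):
        # Propose a neighbouring model: flip one uniformly chosen feature.
        j = rng.randrange(n_features)
        proposal = mask[:j] + (1 - mask[j],) + mask[j + 1:]
        candidate = log_score(proposal) + log_prior(proposal, penalty)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mask, current = proposal, candidate
        for i, bit in enumerate(mask):
            counts[i] += bit
    return [c / n_steps for c in counts]


def toy_log_score(mask):
    """Made-up score in which only features 0 and 2 improve the model."""
    return 3.0 * mask[0] + 3.0 * mask[2]


freqs = mh_feature_sampler(toy_log_score, n_features=4, n_steps=5000, penalty=1.0)
print(freqs)
```

In this toy setting, raising `penalty` shifts the sampler toward smaller subsets, which is one simple way a regularization prior can steer the chain. A real application would replace `toy_log_score` with a score computed from data.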

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Michael Egmont-Petersen
    Institute of Information and Computing Sciences, Utrecht University, De Uithof, The Netherlands
