Bootstrap Feature Selection for Ensemble Classifiers

  • Rakkrit Duangsoithong
  • Terry Windeatt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6171)

Abstract

A small number of samples combined with a high-dimensional feature space degrades classifier performance in machine learning, statistics, and data mining systems. This paper presents a bootstrap feature selection method for ensemble classifiers to address this problem and compares it with traditional feature selection for ensembles, in which an optimal feature subset is selected from the whole dataset before the bootstrap samples are drawn. Four base classifiers, Multilayer Perceptron, Support Vector Machines, Naive Bayes, and Decision Tree, are used to evaluate performance on datasets from the UCI machine learning repository and on causal discovery datasets. The bootstrap feature selection algorithm yields slightly better accuracy than traditional feature selection for ensemble classifiers.
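The two pipelines compared above differ only in where feature selection sits relative to bootstrap sampling. The sketch below is a minimal illustration of that difference, not the paper's implementation: it assumes scikit-learn, uses SelectKBest with an ANOVA F-score filter as a stand-in for the paper's selector, and a decision tree as one of the four base classifiers; all function names and parameter values are illustrative.

```python
# Minimal sketch: feature selection inside vs. outside the bootstrap loop.
# SelectKBest/f_classif and DecisionTreeClassifier are illustrative
# stand-ins for the paper's selector and base classifiers (assumption).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def bootstrap_fs_ensemble(X, y, n_estimators=25, k=10):
    """Bootstrap feature selection: select features on EACH bootstrap
    replicate, then train the base classifier on that replicate."""
    members, n = [], len(y)
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)          # sample with replacement
        sel = SelectKBest(f_classif, k=k).fit(X[idx], y[idx])
        clf = DecisionTreeClassifier(random_state=0).fit(
            sel.transform(X[idx]), y[idx])
        members.append((sel, clf))
    return members

def traditional_fs_ensemble(X, y, n_estimators=25, k=10):
    """Traditional ensemble feature selection: select features ONCE on
    the whole training set, then bag classifiers on the reduced data."""
    sel = SelectKBest(f_classif, k=k).fit(X, y)
    Xr, n, members = sel.transform(X), len(y), []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)
        members.append((sel, DecisionTreeClassifier(random_state=0)
                        .fit(Xr[idx], y[idx])))
    return members

def vote(members, X):
    """Majority vote over ensemble members (binary labels 0/1)."""
    votes = np.stack([clf.predict(sel.transform(X)) for sel, clf in members])
    return (votes.mean(axis=0) > 0.5).astype(int)

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for name, build in [("bootstrap FS", bootstrap_fs_ensemble),
                    ("traditional FS", traditional_fs_ensemble)]:
    acc = (vote(build(X_tr, y_tr), X_te) == y_te).mean()
    print(f"{name}: accuracy = {acc:.3f}")
```

In the bootstrap variant each ensemble member may settle on a different feature subset, which plausibly adds diversity to the ensemble; the traditional variant commits every member to the same subset chosen from the full dataset.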

Keywords

Bootstrap feature selection · Ensemble classifiers

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Rakkrit Duangsoithong¹
  • Terry Windeatt¹

  1. Center for Vision, Speech and Signal Processing, University of Surrey, Guildford, United Kingdom
