Virtual Attribute Subsetting

  • Michael Horton
  • Mike Cameron-Jones
  • Ray Williams
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)

Abstract

Attribute subsetting is a meta-classification technique, based on learning multiple base-level classifiers on projections of the training data. In prior work with nearest-neighbour base classifiers, attribute subsetting was modified to learn only one classifier, then to selectively ignore attributes at classification time to generate multiple predictions. In this paper, the approach is generalized to any type of base classifier. This ‘virtual attribute subsetting’ requires a fast subset choice algorithm; one such algorithm is found and described. In tests with three different base classifier types, virtual attribute subsetting is shown to yield some or all of the benefits of standard attribute subsetting while reducing training time and storage requirements.
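The idea of learning one classifier and then ignoring attributes at classification time can be illustrated with a minimal sketch for the nearest-neighbour case. This is not the authors' implementation: the class `MaskableNearestNeighbour`, the function `virtual_attribute_subsetting`, the uniformly random choice of subsets, and the majority vote are all assumptions made for illustration. The abstract does not specify the paper's subset choice algorithm, so it is not reproduced here.

```python
import random
from collections import Counter


class MaskableNearestNeighbour:
    """Hypothetical 1-NN classifier that can ignore attributes when predicting."""

    def fit(self, X, y):
        # A single model is stored once; no per-subset training takes place.
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x, attributes):
        # Squared Euclidean distance over the chosen attribute subset only.
        def dist(row):
            return sum((row[a] - x[a]) ** 2 for a in attributes)

        best = min(range(len(self.X)), key=lambda i: dist(self.X[i]))
        return self.y[best]


def virtual_attribute_subsetting(model, x, n_attributes,
                                 n_subsets=10, subset_size=2, seed=0):
    """Combine predictions made on several attribute subsets by majority vote.

    Assumption: subsets are drawn uniformly at random, which stands in for
    whatever subset choice algorithm is actually used.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_subsets):
        subset = rng.sample(range(n_attributes), subset_size)
        votes[model.predict_one(x, subset)] += 1
    return votes.most_common(1)[0][0]


# Toy data: two well-separated classes described by three attributes.
X_train = [[0, 0, 0], [0, 1, 0], [5, 5, 5], [6, 5, 6]]
y_train = ["a", "a", "b", "b"]
model = MaskableNearestNeighbour().fit(X_train, y_train)

label_low = virtual_attribute_subsetting(model, [0.2, 0.1, 0.3], n_attributes=3)
label_high = virtual_attribute_subsetting(model, [5.5, 5.0, 5.5], n_attributes=3)
```

Only one model is trained and stored here, whereas standard attribute subsetting would train one classifier per projection of the training data. How attributes are ignored at classification time would depend on the type of base classifier.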

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Michael Horton (1)
  • Mike Cameron-Jones (1)
  • Ray Williams (1)
  1. School of Computing, University of Tasmania, Australia
