Feature-subspace aggregating: ensembles for stable and unstable learners
  • Open Access
  • Published: 18 November 2010


  • Kai Ming Ting1,
  • Jonathan R. Wells1,
  • Swee Chuan Tan1,
  • Shyh Wei Teng1 &
  • Geoffrey I. Webb2

Machine Learning volume 82, pages 375–397 (2011)

Abstract

This paper introduces a new ensemble approach, Feature-Subspace Aggregating (Feating), which builds local models instead of global models. Feating is a generic ensemble approach that can enhance the predictive performance of both stable and unstable learners, whereas most existing ensemble approaches improve the predictive performance of unstable learners only. Our analysis shows that the execution time required to generate each model in the ensemble decreases as the level of localisation in Feating increases. Our empirical evaluation shows that Feating performs significantly better than Boosting, Random Subspace and Bagging in terms of predictive accuracy when the stable learner SVM is used as the base learner. The speedup achieved by Feating makes SVM ensembles feasible for large data sets on which they would otherwise be infeasible. When SVM is the preferred base learner, we show that Feating SVM performs better than Boosting decision trees and Random Forests. We further demonstrate that Feating also substantially reduces the error of another stable learner, k-nearest neighbour, and of an unstable learner, decision tree.
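
To make the approach concrete, below is a minimal, illustrative sketch of a Feating-style ensemble in Python, written from the abstract's description rather than the paper's exact algorithm: each ensemble member is defined by a small subset of features, that subset partitions the instance space into local regions, a local model (here an SVM) is trained on the instances falling in each region, and the members' predictions are combined by majority vote. The class and parameter names (FeatingLikeEnsemble, n_subspace_features, n_bins) and the use of quantile-based binning for numeric attributes are assumptions made for illustration only.

from collections import Counter
from itertools import combinations

import numpy as np
from sklearn.svm import SVC


class FeatingLikeEnsemble:
    """Illustrative Feating-style ensemble: one member per feature subspace,
    with a local SVM per region of that subspace (a sketch, not the published
    algorithm)."""

    def __init__(self, n_subspace_features=2, n_bins=2):
        self.n_subspace_features = n_subspace_features  # attributes defining each subspace
        self.n_bins = n_bins                            # bins per attribute -> local regions
        self.members_ = []                              # (feature indices, bin edges, {region: model or class})
        self.default_ = None                            # global majority class for unseen regions

    def _regions(self, X, feats, edges):
        # Map every instance to the local region given by its binned subspace values.
        binned = np.stack([np.digitize(X[:, f], edges[i])
                           for i, f in enumerate(feats)], axis=1)
        return [tuple(row) for row in binned]

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.default_ = Counter(y).most_common(1)[0][0]
        for feats in combinations(range(X.shape[1]), self.n_subspace_features):
            # Interior quantiles as bin edges for each chosen attribute (assumption).
            edges = [np.quantile(X[:, f], np.linspace(0, 1, self.n_bins + 1)[1:-1])
                     for f in feats]
            regions = self._regions(X, feats, edges)
            local = {}
            for region in set(regions):
                mask = np.array([r == region for r in regions])
                classes = np.unique(y[mask])
                # A single-class region needs no SVM; store the class label directly.
                local[region] = classes[0] if len(classes) == 1 else SVC().fit(X[mask], y[mask])
            self.members_.append((feats, edges, local))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        member_votes = []
        for feats, edges, local in self.members_:
            votes = []
            for i, region in enumerate(self._regions(X, feats, edges)):
                model = local.get(region, self.default_)
                votes.append(model.predict(X[i:i + 1])[0] if isinstance(model, SVC) else model)
            member_votes.append(votes)
        # Simple majority vote over the ensemble members for each instance.
        return np.array([Counter(col).most_common(1)[0][0] for col in zip(*member_votes)])

A usage sketch, assuming X_train/X_test are numeric feature matrices and y_train a vector of class labels: FeatingLikeEnsemble(n_subspace_features=2, n_bins=2).fit(X_train, y_train).predict(X_test).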

Author information

Authors and Affiliations

  1. Gippsland School of Information Technology, Monash University, Vic, 3842, Australia

    Kai Ming Ting, Jonathan R. Wells, Swee Chuan Tan & Shyh Wei Teng

  2. Clayton School of Information Technology, Monash University, Vic, 3800, Australia

    Geoffrey I. Webb

Corresponding author

Correspondence to Kai Ming Ting.

Additional information

Editor: Mark Craven.

Rights and permissions

Open Access. This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Ting, K.M., Wells, J.R., Tan, S.C. et al. Feature-subspace aggregating: ensembles for stable and unstable learners. Mach Learn 82, 375–397 (2011). https://doi.org/10.1007/s10994-010-5224-5

  • Received: 26 March 2009

  • Revised: 01 October 2010

  • Accepted: 17 October 2010

  • Published: 18 November 2010

  • Issue Date: March 2011

  • DOI: https://doi.org/10.1007/s10994-010-5224-5

Keywords

  • Classifier ensembles
  • Stable learners
  • Unstable learners
  • Model diversity
  • Local models
  • Global models