Determining the Best Classification Algorithm with Recourse to Sampling and Metalearning

  • Chapter
Advances in Machine Learning I

Part of the book series: Studies in Computational Intelligence ((SCI,volume 262))

Abstract

Many classification algorithms exist, and none outperforms all the others. It is therefore of interest to determine which classification algorithm is best for a given task. Although algorithms can be compared directly on any given problem using cross-validation, it is desirable to avoid this because the computational cost is significant. We describe a method that relies on relatively fast pairwise comparisons between two algorithms. The method builds on previous work and exploits sampling landmarks, that is, information about learning curves, in addition to classical data characteristics. One key feature of the method is an iterative procedure for extending the series of experiments used to gather new information in the form of sampling landmarks. Metalearning also plays a vital role. The comparisons are repeated for various pairs of algorithms, and the result is represented as a partially ordered ranking. Evaluation is done by comparing the predicted partial order of algorithms with the partial order representing the supposedly correct result. The results of our analysis show that the method performs well and could be of help in practical applications.
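The idea of assembling pairwise comparisons into a partial order, and then checking a predicted order against a reference one, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the algorithm labels `A`, `B`, `C`, the accuracy values, the tie `margin`, and the simple pair-agreement score are hypothetical and are not taken from the chapter, which may use a different evaluation measure.

```python
from itertools import combinations


def pairwise_outcome(acc_a, acc_b, margin=0.01):
    """Return 1 if a beats b, -1 if b beats a, 0 if they are within the margin."""
    diff = acc_a - acc_b
    if diff > margin:
        return 1
    if diff < -margin:
        return -1
    return 0


def partial_order(estimates, margin=0.01):
    """Map every unordered pair of algorithms to its pairwise outcome."""
    return {
        (a, b): pairwise_outcome(estimates[a], estimates[b], margin)
        for a, b in combinations(sorted(estimates), 2)
    }


def agreement(predicted, reference):
    """Fraction of pairs on which the two partial orders agree (illustrative measure)."""
    matches = sum(predicted[pair] == reference[pair] for pair in reference)
    return matches / len(reference)


# Hypothetical accuracy estimates: "predicted" stands in for cheap estimates
# (e.g. from samples), "reference" for a full cross-validation result.
predicted_acc = {"A": 0.820, "B": 0.815, "C": 0.740}
reference_acc = {"A": 0.830, "B": 0.790, "C": 0.750}

predicted = partial_order(predicted_acc)
reference = partial_order(reference_acc)
score = agreement(predicted, reference)
print(predicted)
print(reference)
print(score)
```

In this toy data the predicted order leaves `A` and `B` tied while the reference order ranks `A` above `B`, so the two partial orders agree on two of the three pairs.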




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Brazdil, P., Leite, R. (2010). Determining the Best Classification Algorithm with Recourse to Sampling and Metalearning. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_8


  • DOI: https://doi.org/10.1007/978-3-642-05177-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05176-0

  • Online ISBN: 978-3-642-05177-7

  • eBook Packages: Engineering, Engineering (R0)
