Determining the Best Classification Algorithm with Recourse to Sampling and Metalearning

Brazdil, Pavel; Leite, Rui

doi:10.1007/978-3-642-05177-7_8

Part of the book series: Studies in Computational Intelligence (SCI, volume 262)

Abstract

Many classification algorithms exist today, and none of them outperforms all the others on every task. It is therefore of interest to determine which classification algorithm is best for a given task. Although the candidates can be compared directly on any given problem using cross-validation, it is desirable to avoid this, as the computational costs are significant. We describe a method that relies on relatively fast pairwise comparisons, each involving two algorithms. The method builds on previous work and exploits sampling landmarks, that is, information about learning curves, in addition to classical data characteristics. One key feature of the method is an iterative procedure for extending the series of experiments used to gather new information in the form of sampling landmarks. Metalearning also plays a vital role. The comparisons are repeated for various pairs of algorithms, and the result is represented as a partially ordered ranking. Evaluation is done by comparing the predicted partial order of algorithms with the partial order representing the supposedly correct result. The results of our analysis show that the method performs well and could be of help in practical applications.
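
The last steps described in the abstract (repeating pairwise comparisons, collecting the outcomes as a partial order, and comparing a predicted order with a reference one) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the tolerance-based tie rule, the accuracy figures, and the simple pairwise agreement score are hypothetical and are not the chapter's procedure or evaluation measure.

```python
from itertools import combinations


def compare_by_accuracy(acc, tol):
    """Build a pairwise comparator from accuracy estimates (hypothetical rule).

    Returns +1 if a is better than b, -1 if worse, and 0 if the difference
    is within `tol`, i.e. the pair is left unordered.
    """
    def compare(a, b):
        diff = acc[a] - acc[b]
        if abs(diff) <= tol:
            return 0
        return 1 if diff > 0 else -1
    return compare


def build_partial_order(algorithms, compare):
    """Run the comparator on every pair and store the outcome per pair."""
    return {(a, b): compare(a, b) for a, b in combinations(algorithms, 2)}


def agreement(predicted, reference):
    """Fraction of pairs on which the two partial orders give the same outcome."""
    matches = sum(1 for pair in reference if predicted[pair] == reference[pair])
    return matches / len(reference)


# Illustrative, made-up accuracies; not results from the chapter.
algorithms = ["A", "B", "C"]
reference_acc = {"A": 0.90, "B": 0.85, "C": 0.84}  # e.g. from full cross-validation
predicted_acc = {"A": 0.88, "B": 0.87, "C": 0.80}  # e.g. estimated from small samples

reference = build_partial_order(algorithms, compare_by_accuracy(reference_acc, tol=0.02))
predicted = build_partial_order(algorithms, compare_by_accuracy(predicted_acc, tol=0.02))
score = agreement(predicted, reference)
print(reference, predicted, score)
```

In this sketch, pairs whose estimated accuracies differ by less than the tolerance stay unordered, which is what makes the resulting ranking partial rather than total.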

Author information

Authors and Affiliations

LIAAD-INESC Porto L.A./Faculty of Economics, University of Porto, Rua de Ceuta, 118-6, 4050-190, Porto, Portugal
Pavel Brazdil & Rui Leite

Editor information

Editors and Affiliations

Institute of Computer Science, Polish Academy of Sciences, ul.Ordona 21, 01-237, Warsaw, Poland
Jacek Koronacki & Sławomir T. Wierzchoń
Woodward Hall 430C University of North Carolina, 9201 University City Blvd., N.C. 28223, Charlotte, USA
Zbigniew W. Raś
Systems Research Institute, Polish Academy of Sciences, ul.Newelska 6, 01-447, Warsaw, Poland
Janusz Kacprzyk

About this chapter

Cite this chapter

Brazdil, P., Leite, R. (2010). Determining the Best Classification Algorithm with Recourse to Sampling and Metalearning. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_8

DOI: https://doi.org/10.1007/978-3-642-05177-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05176-0
Online ISBN: 978-3-642-05177-7
eBook Packages: Engineering, Engineering (R0)
