Abstract
We present a meta-learning method to support selection of candidate learning algorithms. It uses a k-Nearest Neighbor algorithm to identify the datasets that are most similar to the one at hand. The distance between datasets is assessed using a relatively small set of data characteristics, which was selected to represent properties that affect algorithm performance. The performance of the candidate algorithms on those datasets is used to generate a recommendation to the user in the form of a ranking. The performance is assessed using a multicriteria evaluation measure that takes not only accuracy, but also time into account. As it is not common in Machine Learning to work with rankings, we had to identify and adapt existing statistical techniques to devise an appropriate evaluation methodology. Using that methodology, we show that the meta-learning method presented leads to significantly better rankings than the baseline ranking method. The evaluation methodology is general and can be adapted to other ranking problems. Although here we have concentrated on ranking classification algorithms, the meta-learning framework presented can provide assistance in the selection of combinations of methods or more complex problem solving strategies.
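To make the approach concrete, the sketch below illustrates the core idea in Python: given precomputed meta-features for the stored datasets and the recorded accuracy and run time of each candidate algorithm on them, a k-Nearest Neighbor step selects the datasets most similar to the one at hand, and the candidates are ranked by an aggregated accuracy/time score over those neighbours. This is a minimal sketch, not the paper's implementation: the function and parameter names, the distance normalization, and the simple accuracy/time trade-off controlled by acc_d are illustrative assumptions, whereas the paper's multicriteria measure and evaluation methodology are more elaborate.

import numpy as np

def knn_rank_algorithms(meta_features, accuracy, run_time, query, k=3, acc_d=0.1):
    """Rank candidate algorithms for a new dataset via k-NN meta-learning.

    meta_features : (n_datasets, n_features) data characteristics of stored datasets
    accuracy, run_time : (n_datasets, n_algorithms) recorded performance results
    query : meta-feature vector of the dataset at hand
    acc_d : relative weight of time against accuracy (illustrative trade-off)
    """
    # 1. Find the k stored datasets most similar to the query
    #    (Euclidean distance on scale-normalized meta-features).
    scale = meta_features.std(axis=0) + 1e-12
    dists = np.linalg.norm((meta_features - query) / scale, axis=1)
    neighbours = np.argsort(dists)[:k]

    # 2. Combine accuracy and time into one score per algorithm on each
    #    neighbouring dataset (a simple stand-in for the paper's
    #    multicriteria measure).
    scores = accuracy[neighbours] - acc_d * np.log1p(run_time[neighbours])

    # 3. Aggregate over the neighbours and return algorithm indices, best first.
    return list(np.argsort(-scores.mean(axis=0)))

# Toy usage: 4 stored datasets, 3 candidate algorithms, 2 meta-features.
rng = np.random.default_rng(0)
mf = rng.random((4, 2))
acc = rng.random((4, 3))
t = rng.random((4, 3)) * 10.0
print(knn_rank_algorithms(mf, acc, t, query=np.array([0.5, 0.5]), k=2))

The recommended ranking could then be compared against the ranking obtained by actually running the algorithms on the new dataset, for instance with a rank correlation coefficient such as Spearman's, which is in the spirit of the evaluation methodology discussed in the paper.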
Cite this article
Brazdil, P.B., Soares, C. & da Costa, J.P. Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results. Machine Learning 50, 251–277 (2003). https://doi.org/10.1023/A:1021713901879