Machine Learning, Volume 93, Issue 1, pp 141–161

Pairwise meta-rules for better meta-learning-based algorithm ranking



In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed new techniques can improve the overall performance of meta-learning for algorithm ranking significantly. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.
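The one-against-one comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the performance table, the learner names `A`, `B`, `C`, and the helper functions `pairwise_labels` and `ranking` are all hypothetical, and the sketch only shows how per-dataset performance figures might be turned into pairwise "which learner wins" targets and a target ranking. How the paper induces rules from such comparisons, and how ART Forests work, is not shown here.

```python
from itertools import combinations

# Hypothetical table of tuned accuracies: dataset -> {learner: accuracy}.
# In the paper's setting each figure would come from optimising the
# learner's parameters separately for that dataset.
performance = {
    "d1": {"A": 0.90, "B": 0.85, "C": 0.80},
    "d2": {"A": 0.70, "B": 0.75, "C": 0.72},
}


def pairwise_labels(scores):
    """Return {(a, b): winner} for every unordered pair of learners.

    Ties are resolved in favour of the first learner of the pair
    (an arbitrary choice made for this illustration).
    """
    labels = {}
    for a, b in combinations(sorted(scores), 2):
        labels[(a, b)] = a if scores[a] >= scores[b] else b
    return labels


def ranking(scores):
    """Order learners from best to worst; ties are broken by name."""
    return sorted(scores, key=lambda name: (-scores[name], name))


# One-against-one targets and the target ranking for each dataset.
pair_targets = {d: pairwise_labels(s) for d, s in performance.items()}
rankings = {d: ranking(s) for d, s in performance.items()}
```

With three learners there are three pairwise comparisons per dataset. Under the assumptions above, each comparison could serve as the class label for a separate binary rule learner, while `rankings` holds the ordering a meta-learner would be trained to predict.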


Keywords: Meta-learning, Algorithm ranking, Ranking trees, Ensemble learning



Copyright information

© The Author(s) 2013

Authors and Affiliations

  1. Department of Computer Science, The University of Waikato, Hamilton, New Zealand
