Hierarchical Meta-Rules for Scalable Meta-Learning

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8862)


The Pairwise Meta-Rules (PMR) method proposed in [18] has been shown to improve the predictive performance of several meta-learning algorithms for the algorithm ranking problem. Given m target objects (e.g., algorithms), the training complexity of the PMR method is quadratic in m: \(\binom{m}{2} = m(m - 1) / 2\). This is usually not a problem when m is moderate, such as when ranking 20 different learning algorithms. However, for problems with a much larger m, such as the meta-learning-based parameter ranking problem, where m can exceed 100, the PMR method becomes inefficient. In this paper, we propose a novel method named Hierarchical Meta-Rules (HMR), which is based on the theory of orthogonal contrasts. The proposed HMR method has linear training complexity in m, providing a way of dealing with numbers of objects that the PMR method cannot handle efficiently. Our experimental results demonstrate the benefit of the new method in the context of meta-learning.
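The complexity gap in the abstract can be illustrated with a back-of-the-envelope sketch (the function names below are ours, for illustration only). PMR trains one binary meta-rule model per pair of target objects, giving \(\binom{m}{2}\) models; HMR, assuming one model per internal node of a binary hierarchy over the m objects, needs only m − 1:

```python
from math import comb

def pmr_models(m: int) -> int:
    """Pairwise Meta-Rules: one binary meta-rule model per pair of
    target objects, i.e. C(m, 2) = m * (m - 1) / 2 models."""
    return comb(m, 2)

def hmr_models(m: int) -> int:
    """Hierarchical Meta-Rules (assumed cost model): one model per
    internal node of a binary hierarchy over m objects, i.e. m - 1."""
    return m - 1

# Ranking 20 algorithms is cheap either way; at m = 100+ (e.g.
# parameter ranking) the quadratic pairwise count dominates.
for m in (20, 100, 500):
    print(f"m={m:4d}  PMR={pmr_models(m):6d}  HMR={hmr_models(m):4d}")
```

For m = 100, this is 4950 pairwise models versus 99 hierarchical ones, which is the scalability argument the paper makes.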




References

  1. Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning 36(1–2), 105–139 (1999)
  2. Brazdil, P., Gama, J., Henery, B.: Characterizing the applicability of classification algorithms using meta-level learning. In: Bergadano, F., De Raedt, L. (eds.) ECML 1994. LNCS, vol. 784, pp. 83–102. Springer, Heidelberg (1994)
  3. Brazdil, P., Giraud-Carrier, C., Soares, C., Vilalta, R.: Metalearning: Applications to Data Mining. Springer (2009)
  4. Brazdil, P., Soares, C., Da Costa, J.P.: Ranking learning algorithms: Using IBL and meta-learning on accuracy and time results. Machine Learning 50(3), 251–277 (2003)
  5. Caruana, R., Niculescu-Mizil, A.: An empirical comparison of supervised learning algorithms. In: Proceedings of the 23rd International Conference on Machine Learning (ICML 2006), pp. 161–168 (2006)
  6. Chung, L., Marden, J.I.: Use of nonnull models for rank statistics in bivariate, two-sample, and analysis of variance problems. Journal of the American Statistical Association 86(413), 188–200 (1991)
  7. Chung, L., Marden, J.I.: Extensions of Mallows' φ model. In: Probability Models and Statistical Analyses for Ranking Data, pp. 108–139. Springer (1993)
  8. Cohen, W.W.: Fast effective rule induction. In: Proceedings of the 12th International Conference on Machine Learning. Morgan Kaufmann (1995)
  9. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: An update. SIGKDD Explorations 11(1) (2009)
  10. Kalousis, A.: Algorithm Selection via Meta-Learning. Ph.D. thesis, Department of Computer Science, University of Geneva (2002)
  11. Kendall, M.G.: Rank Correlation Methods. Griffin (1970)
  12. Marden, J.: Analyzing and Modeling Rank Data. Monographs on Statistics and Applied Probability. Chapman and Hall (1995)
  13. Pfahringer, B., Bensusan, H., Giraud-Carrier, C.: Meta-learning by landmarking various learning algorithms. In: Proceedings of the 17th International Conference on Machine Learning (2000)
  14. van Rijn, J.N., Holmes, G., Pfahringer, B., Vanschoren, J.: Algorithm selection on data streams. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds.) DS 2014. LNCS (LNAI), vol. 8777, pp. 325–336. Springer, Heidelberg (2014)
  15. Rossi, A.L.D., De Carvalho, A.C.P.D.L.F., Soares, C., De Souza, B.F.: MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data. Neurocomputing 127, 52–64 (2014)
  16. Steinbach, M., Karypis, G., Kumar, V.: A comparison of document clustering techniques. In: KDD Workshop on Text Mining (2000)
  17. Sun, Q.: Meta-Learning and the Full Model Selection Problem. Ph.D. thesis, The University of Waikato (2014)
  18. Sun, Q., Pfahringer, B.: Pairwise meta-rules for better meta-learning-based algorithm ranking. Machine Learning 93(1), 141–161 (2013)
  19. Ward Jr., J.H.: Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58(301), 236–244 (1963)
  20. Weiss, S.M., Kapouleas, I.: An empirical comparison of pattern recognition, neural nets, and machine learning classification methods. In: Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, pp. 781–787. Morgan Kaufmann (1989)
  21. Wolpert, D., Macready, W.: No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1(1), 67–82 (1997)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Computer Science, The University of Waikato, Hamilton, New Zealand
