
Towards Explainable Meta-learning

  • Conference paper
  • First Online:

In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021)

Abstract

Meta-learning is a field that aims to discover how different machine learning algorithms perform across a wide range of predictive tasks. Such knowledge speeds up hyperparameter tuning and feature engineering. With the use of surrogate models, various aspects of the predictive task, such as meta-features and landmarker models, are used to predict the expected performance. State-of-the-art approaches focus on searching for the best meta-model but do not explain how these different aspects contribute to its performance. However, to build a new generation of meta-models, we need a deeper understanding of the importance and effect of meta-features on model tunability. This paper proposes using techniques developed for eXplainable Artificial Intelligence (XAI) to examine and extract knowledge from black-box surrogate models. To our knowledge, this is the first paper to show how post-hoc explainability can be used to improve meta-learning.
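To make the pipeline described in the abstract concrete, the sketch below fits a black-box surrogate meta-model that maps dataset meta-features to expected algorithm performance, then probes it with two post-hoc XAI tools: permutation feature importance and a partial dependence profile. This is a minimal illustration under stated assumptions, not the authors' code (their tooling is R-based); it uses Python with scikit-learn, and the meta-feature names and synthetic meta-dataset are purely hypothetical.

```python
# Minimal sketch: surrogate meta-model + post-hoc XAI.
# Assumptions: synthetic data and hypothetical meta-feature names;
# this is NOT the paper's actual pipeline or code.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence, permutation_importance

rng = np.random.default_rng(0)

# Hypothetical meta-dataset: one row per dataset, columns are meta-features,
# target is the measured performance (e.g., AUC) of some tuned algorithm.
meta_features = ["n_rows", "n_features", "class_imbalance", "landmarker_knn_auc"]
X = rng.uniform(size=(500, len(meta_features)))
# Toy target: constructed so the landmarker meta-feature dominates.
y = 0.6 + 0.3 * X[:, 3] + 0.05 * X[:, 2] + rng.normal(scale=0.02, size=500)

# Black-box surrogate meta-model: meta-features -> expected performance.
surrogate = GradientBoostingRegressor(random_state=0).fit(X, y)

# Post-hoc explanation 1: permutation importance -- how much does shuffling
# each meta-feature degrade the surrogate's predictions?
result = permutation_importance(surrogate, X, y, n_repeats=20, random_state=0)
for name, score in sorted(zip(meta_features, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name:>20s}: {score:.4f}")

# Post-hoc explanation 2: partial dependence of the predicted performance
# on the landmarker meta-feature (column index 3).
pdp = partial_dependence(surrogate, X, features=[3], grid_resolution=10)
print(pdp["average"][0])  # average predicted performance along the grid
```

On the toy target above, permutation importance should rank the landmarker meta-feature first, mirroring the kind of insight the paper extracts from real meta-data: which properties of a task drive the surrogate's performance predictions, and how the prediction changes as a single meta-feature varies.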



Acknowledgements

The work on this paper was financially supported by NCN Opus grant 2017/27/B/ST6/01307.

Author information

Correspondence to Katarzyna Woźnica.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Woźnica, K., Biecek, P. (2021). Towards Explainable Meta-learning. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham. https://doi.org/10.1007/978-3-030-93736-2_38


  • DOI: https://doi.org/10.1007/978-3-030-93736-2_38

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93735-5

  • Online ISBN: 978-3-030-93736-2

  • eBook Packages: Computer Science, Computer Science (R0)
