A Perspective View and Survey of Meta-Learning

Abstract

Different researchers hold different views of what the term meta-learning exactly means. The first part of this paper provides our own perspective view in which the goal is to build self-adaptive learners (i.e. learning algorithms that improve their bias dynamically through experience by accumulating meta-knowledge). The second part provides a survey of meta-learning as reported by the machine-learning literature. We find that, despite different views and research lines, a question remains constant: how can we exploit knowledge about learning (i.e. meta-knowledge) to improve the performance of learning algorithms? Clearly the answer to this question is key to the advancement of the field and continues to be the subject of intensive research.
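The abstract's central question — how accumulated meta-knowledge can improve learning — can be illustrated with a toy sketch of one common meta-learning setting, algorithm recommendation. Everything below (the meta-features chosen, the stored episodes, the names `meta_features` and `recommend`) is an illustrative assumption, not a method from the paper: past learning episodes are summarized by simple dataset meta-features, and a new task is matched to its nearest past episode to pick a learner.

```python
import math

def meta_features(dataset):
    """Summarize a labeled dataset ((x, y) pairs, y in {0, 1}) by two
    simple meta-features: log-size and class balance."""
    n = len(dataset)
    positives = sum(1 for _, y in dataset if y == 1)
    return (math.log(n), positives / n)

# Meta-knowledge base: (meta-features, best learner) pairs accumulated
# from earlier learning episodes.  The entries are made up for this sketch.
META_KNOWLEDGE = [
    ((math.log(50), 0.50), "nearest-neighbour"),
    ((math.log(5000), 0.50), "decision-tree"),
    ((math.log(5000), 0.05), "decision-tree-with-resampling"),
]

def recommend(dataset):
    """Pick a learner for a new task: 1-nearest neighbour over the
    meta-feature space of past episodes."""
    mf = meta_features(dataset)
    _, learner = min(
        META_KNOWLEDGE,
        key=lambda entry: sum((a - b) ** 2 for a, b in zip(entry[0], mf)),
    )
    return learner

# A large, heavily imbalanced task (5% positives) matches the third episode.
task = [(i, 1 if i % 20 == 0 else 0) for i in range(5000)]
print(recommend(task))  # -> decision-tree-with-resampling
```

A self-adaptive learner in the paper's sense would additionally append the outcome of each new episode to the meta-knowledge base, so that the bias (here, the choice of learner) shifts as experience accumulates.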

References

  1. Aha, D. W. (1992). Generalizing from Case Studies: A Case Study. Proceedings of the Ninth International Workshop on Machine Learning, 1–10. Morgan Kaufmann.

  2. Ali, K. & Pazzani, M. J. (1996). Error Reduction Through Learning Model Descriptions. Machine Learning 24: 173–202.

  3. Baltes, J. (1992). Case-Based Meta Learning: Sustained Learning Supported by a Dynamically Biased Version Space. Proceedings of the Machine Learning Workshop on Biases in Inductive Learning.

  4. Baum, E. B. (1998). Manifesto for an Evolutionary Economics of Intelligence. In Bishop, C. M. (ed.) Neural Networks and Machine Learning, 285–344. Springer-Verlag.

  5. Baxter Jonathan (1998). Theoretical Models of Learning to Learn. Learning to Learn 4: 71–94. Kluwer Academic Publishers, MA.

  6. Baxter Jonathan (2000). A Model of Inductive Learning Bias. Journal of Artificial Intelligence Research 12: 149–198.

  7. Bensusan Hilan & Giraud-Carrier Christophe (2000). Casa Batlo in Passeig or landmarking the expertise space. Eleventh European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination. Barcelona, Spain.

  8. Bensusan Hilan, Giraud-Carrier Christophe & Kennedy, C. J. (2000). A High-Order Approach to Meta-Learning. Eleventh European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination. Barcelona, Spain.

  9. Blumer, A., Ehrenfeucht, A., Haussler, D. & Warmuth, M. K. (1989). Learnability and the Vapnik-Chervonenkis Dimension. Journal of the ACM 36: 929–965.

  10. Booker, L., Goldberg, D. & Holland, J. (1989). Classifier Systems and Genetic Algorithms. Artificial Intelligence 40: 235–282.

  11. Brazdil, P. B. (1998). Data Transformation and Model Selection by Experimentation and Meta-Learning. Proceedings of the ECML-98 Workshop on Upgrading Learning to Meta-Level: Model Selection and Data Transformation, 11–17. Technical University of Chemnitz.

  12. Brazdil Pavel, B. & Soares Carlos (2000). Ranking Classification Algorithms Based on Relevant Performance Information. Eleventh European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination. Barcelona, Spain.

  13. Breiman, L. (1996). Bagging Predictors. Machine Learning 24: 123–140.

  14. Brodley, C. (1993). Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection. Proceedings of the Tenth International Conference on Machine Learning, 17–24. San Mateo, CA, Morgan Kaufmann.

  15. Brodley Carla (1994). Recursive Automatic Bias Selection for Classifier Construction. Machine Learning 20.

  16. Brodley, C. & Lane, T. (1996). Creating and Exploiting Coverage and Diversity. Proceedings of the AAAI-96 Workshop on Integrating Multiple Learned Models, 8–14. Portland, Oregon.

  17. Bruha Ivan (2000). A feedback loop for refining rule qualities in a classifier: a reward-penalty strategy. Eleventh European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination. Barcelona, Spain.

  18. Caruana Rich (1997). Multitask Learning. Second Special Issue on Inductive Transfer. Machine Learning 28: 41–75.

  19. Caruana Rich (1998). Multitask Learning.

  20. Chan Philip, K. & Stolfo, S. (1998). On the Accuracy of Meta-Learning for Scalable Data Mining. In Kerschberg, L. (ed.) Journal of Intelligent Integration of Information.

  21. Chan Philip, K. & Stolfo, S. (1993). Experiments on Multistrategy Learning by Meta-Learning. Proceedings of the International Conference on Information Knowledge Management, 314–323.

  22. Chan, P. K. (1996). An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Thesis, Graduate School of Arts and Sciences, Columbia University.

  23. Cohen Paul & Feigenbaum Edward (1989). Learning and Inductive Inference. The Handbook of Artificial Intelligence, Volume III, 326–334. Addison-Wesley.

  24. DesJardins Marie & Gordon Diana, F. (1995A). Special issue on bias evaluation and selection. Machine Learning 20(1/2).

  25. DesJardins Marie & Gordon Diana, F. (1995B). Evaluation and Selection of Biases in Machine Learning. Machine Learning 20: 5–22.

  26. Domingos Pedro (1997). Knowledge Acquisition from Examples Via Multiple Models. Proceedings of the Fourteenth International Conference on Machine Learning, 98–106. Morgan Kaufmann, Nashville TN.

  27. Domingos Pedro (1998). Knowledge Discovery Via Multiple Models. Intelligent Data Analysis 2: 187–202.

  28. Fan Wei, Stolfo, S. & Chan Philip, K. (1999). Using Conflicts Among Multiple Base Classifiers to Measure the Performance of Stacking. In Giraud-Carrier Christophe & Pfahringer Bernhard (eds.) Proceedings of the ICML-99 Workshop on Recent Advances in Meta-Learning and Future Work, 10–15. Stefan Institute Publisher, Ljubljana.

  29. Freund, Y. & Schapire, R. E. (1996). Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning, 148–156. Morgan Kaufmann, Bari, Italy.

  30. Gama, J. & Brazdil, P. (1995). Characterization of Classification Algorithms. Proceedings of the Seventh Portuguese Conference on Artificial Intelligence, EPIA, 189–200. Funchal, Madeira Island, Portugal.

  31. Giraud-Carrier Christophe (1998). Beyond Predictive Accuracy: What?. Proceedings of the ECML-98 Workshop on Upgrading Learning to Meta-Level: Model Selection and Data Transformation, 78–85. Technical University of Chemnitz.

  32. Goel Ashok, K. (1996). Meta-Cases: Explaining Case-Based Reasoning. Proceedings of the Third European Workshop on Case-Based Reasoning. Published in Advances in Case-Based Reasoning, Lecture Notes in Computer Science, 1168. Springer, New York.

  33. Gordon Diana & Perlis Donald (1989). Explicitly Biased Generalization. Computational Intelligence 5: 67–81.

  34. Gordon Diana, F. (1992). Queries for Bias Testing. Proceedings of the Workshop on Change of Representation and Problem Reformulation.

  35. Gordon Diana, F. (1990). Active Bias Adjustment for Incremental, Supervised Concept Learning. PhD Thesis, University of Maryland.

  36. Holland John, Booker Lashon, Colombetti Marco, Dorigo Marco, Goldberg David, Forrest Stephanie, Riolo Rick, Smith Robert, Lanzi Pier Luca, Stolzmann Wolfgang & Wilson Stewart (2000). What Is a Learning Classifier System? Lecture Notes in Artificial Intelligence LNAI 1813, 3–22. Springer-Verlag.

  37. Holland John (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor (Republished by the MIT Press, 1992).

  38. Holland, J. & Reitman, J. (1978). Cognitive Systems Based on Adaptive Algorithms. In Waterman, D. A. & Hayes-Roth, F. (eds.) Pattern-Directed Inference Systems. New York: Academic Press.

  39. Keller, J., Paterson, I. & Berrer, H. (2000). An Integrated Concept for Multi-Criteria Ranking of Data-Mining Algorithms. Eleventh European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination. Barcelona, Spain.

  40. Kohavi Ron (1995). A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1137–1143.

  41. Lanzi Pier Luca, Stolzmann Wolfgang & Wilson Stewart, W. (2000). Learning Classifier Systems: From Foundations to Applications. Lecture Notes in Artificial Intelligence 1813. Springer-Verlag, New York.

  42. Li, M. & Vitanyi, P. (1997). An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, New York.

  43. Merz Christopher, J. (1995A). Dynamic Learning Bias Selection. Preliminary papers of the Fifth International Workshop on Artificial Intelligence and Statistics, 386–395. Florida.

  44. Merz Christopher, J. (1995B). Dynamical Selection of Learning Algorithms. In Fisher, D. & Lenz, H. J. (eds.) Learning from Data: Artificial Intelligence and Statistics. Springer-Verlag.

  45. Michie, D., Spiegelhalter, D. J. & Taylor, C. C. (1994). Machine Learning, Neural and Statistical Classification. Ellis Horwood, Chichester, England.

  46. Minton, S. (1993). An Analytic Learning System for Specialized Heuristics. Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence.

  47. Minton, S. (1989). Explanation-Based Learning: A Problem Solving Perspective. Artificial Intelligence 40: 63–118.

  48. Mitchell Tom (1980). The need for biases in learning generalizations. Technical Report CBM-TR-117. Computer Science Department, Rutgers University, New Brunswick, NJ 08904.

  49. Mitchell, T. (1997). Machine Learning. McGraw-Hill.

  50. Pfahringer, B., Bensusan, H. & Giraud-Carrier, C. (2000). Meta-Learning by Landmarking Various Learning Algorithms. Proceedings of the Seventeenth International Conference on Machine Learning. Stanford, CA.

  51. Prasad, M. V. Nagendra & Lesser Victor, R. (1997). The Use of Meta-Level Information in Learning Situation-Specific Coordination. Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence. Nagoya, Japan.

  52. Pratt Lorien & Thrun Sebastian (1997). Second Special Issue on Inductive Transfer. Machine Learning 28.

  53. Pratt, L. & Jennings, B. (1998). A Survey of Connectionist Network Reuse Through Transfer. Learning to Learn 2: 19–43. Kluwer Academic Publishers, MA.

  54. Prodromidis, A. L., Chan, P. K. & Stolfo, S. (1999). Meta-Learning in Distributed Data Mining Systems: Issues and Approaches. In Kargupta & Chan (eds.) Advances in Distributed Data Mining. AAAI Press.

  55. Prodromidis Andreas, L. & Stolfo, S. (1999A). A Comparative Evaluation of Meta-Learning Strategies over Large and Distributed Data Sets. In Giraud-Carrier Christophe and Pfahringer Bernhard (eds.) Proceedings of the ICML-99 Workshop on Recent Advances in Meta-Learning and Future Work, 18–27. Stefan Institute Publisher, Ljubljana.

  56. Prodromidis Andreas, L. & Stolfo, S. (1999B). Minimal Cost Complexity Pruning of Meta-Classifiers. Proceedings of AAAI, Extended Abstract.

  57. Rao, R. B., Gordon, D. & Spears, W. (1995). For Every Generalization Action, Is There Really an Equal and Opposite Reaction? Analysis of the Conservation Law for Generalization Performance. Proceedings of the Twelfth International Conference on Machine Learning, 471–479. Morgan Kaufmann.

  58. Rendell, L., Seshu, R. & Tcheng, D. (1987A). More Robust Concept Learning Using Dynamically-Variable Bias. Proceedings of the Fourth International Workshop on Machine Learning, 66–78. Morgan Kaufmann.

  59. Rendell, L., Seshu, R. & Tcheng, D. (1987B). Layered Concept-Learning and Dynamically-Variable Bias Management. Proceedings of the International Joint Conference on Artificial Intelligence, 308–314. Milan, Italy.

  60. Ring, M. B. (1998). CHILD: A First Step Towards Continual Learning. Learning to Learn 11: 261–292. Kluwer Academic Publishers, MA.

  61. Schaffer, C. (1994). A Conservation Law for Generalization Performance. Proceedings of the Eleventh International Conference on Machine Learning, 259–265. San Francisco, Morgan Kaufmann.

  62. Schmidhuber, J. (1995). Discovering Solutions with Low Kolmogorov Complexity and High Generalization Capability. Proceedings of the Twelfth International Conference on Machine Learning, 488–449. Morgan Kaufmann.

  63. Sutton, R. & Barto, A. (1995). Reinforcement Learning. MIT Press, Cambridge, Massachusetts.

  64. Thrun, S. & Mitchell, T. (1995). Learning One More Thing. Proceedings of the International Joint Conference on Artificial Intelligence, 1217–1223. Morgan Kaufmann.

  65. Thrun Sebastian & Lorien Pratt (1998). Learning To Learn: Introduction And Overview. Learning to Learn 1: 3–17. Kluwer Academic Publishers, MA.

  66. Thrun Sebastian (1998). Lifelong Learning Algorithms. Learning to Learn 8: 181–209. Kluwer Academic Publishers, MA.

  67. Thrun, S. & O'Sullivan, J. (1998). Clustering Learning Tasks and the Selective Cross-Task Transfer of Knowledge. Learning to Learn 10: 235–257. Kluwer Academic Publishers, MA.

  68. Todorovski Ljupco & Dzeroski Saso (2000). Combining Multiple Models with Meta Decision Trees. Eleventh European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination. Barcelona, Spain.

  69. Utgoff, P. (1986). Shift of Bias for Inductive Concept Learning. In Michalski, R. S. et al. (eds.) Machine Learning: An Artificial Intelligence Approach, Vol. II, 107–148. Morgan Kaufmann, California.

  70. Valiant, L. G. (1984). A Theory of the Learnable. Communications of the ACM 27: 1134–1142.

  71. Vilalta, R. (1998). On the Development of Inductive Learning Algorithms: Generating Flexible And Adaptable Concept Representations. PhD Thesis, University of Illinois at Urbana-Champaign.

  72. Vilalta, R. (2001). Research Directions in Meta-Learning: Building Self-Adaptive Learners. International Conference on Artificial Intelligence. Las Vegas, Nevada.

  73. Watanabe Satosi (1969). Knowing and Guessing, A Formal and Quantitative Study. John Wiley & Sons Inc.

  74. Watanabe Satosi (1985). Pattern Recognition: Human and Mechanical. John Wiley & Sons Inc.

  75. Wolpert, D. (1992). Stacked Generalization. Neural Networks 5: 241–259.

  76. Wolpert, D. (1996). The Lack of a Priori Distinctions Between Learning Algorithms and the Existence of a Priori Distinctions Between Learning Algorithms. Neural Computation 8.

Vilalta, R., Drissi, Y. A Perspective View and Survey of Meta-Learning. Artificial Intelligence Review 18, 77–95 (2002). https://doi.org/10.1023/A:1019956318069

Keywords

  • classification
  • inductive learning
  • meta-knowledge