The Supervised Learning No-Free-Lunch Theorems



This paper reviews the supervised learning versions of the no-free-lunch theorems in a simplified form. It also discusses the significance of those theorems, and their relation to other aspects of supervised learning.


Keywords: Cross Validation, Error Function, Supervised Learning, Generalization Error, Misclassification Rate





Copyright information

© Springer-Verlag London 2002

Authors and Affiliations

  1. MS 269-1, NASA Ames Research Center, Moffett Field, USA
