Mathematical Modelling of Generalization

  • Conference paper
Neural Nets (WIRN 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2486)

Abstract

This paper surveys certain developments in the use of probabilistic techniques for the modelling of generalization. Some of the main methods and key results are discussed. Many details are omitted, the aim being to give a high-level overview of the types of approaches taken and methods used.
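
To give a concrete sense of the kind of result such probabilistic techniques yield, consider the classical Vapnik–Chervonenkis uniform convergence bound (a standard statement, included here for orientation rather than quoted from the paper): for a class $H$ of $\{0,1\}$-valued functions, an i.i.d. sample $\mathbf{s}$ of size $m$ drawn from a distribution $P$, true error $\operatorname{er}_P(h)$, and sample error $\hat{\operatorname{er}}_{\mathbf{s}}(h)$,

$$\Pr\left\{\sup_{h\in H}\bigl|\operatorname{er}_P(h)-\hat{\operatorname{er}}_{\mathbf{s}}(h)\bigr|>\epsilon\right\}\;\le\;4\,\Pi_H(2m)\,e^{-\epsilon^2 m/8},$$

where $\Pi_H$ is the growth function of $H$. When $H$ has finite VC dimension $d$, the Sauer–Shelah lemma gives $\Pi_H(2m)\le(2em/d)^d$ for $m\ge d$, so the bound decays to zero as $m\to\infty$: with high probability, sample error is then uniformly close to true error over the whole class.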

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Anthony, M. (2002). Mathematical Modelling of Generalization. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2002. Lecture Notes in Computer Science, vol 2486. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45808-5_20

  • DOI: https://doi.org/10.1007/3-540-45808-5_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44265-3

  • Online ISBN: 978-3-540-45808-1
