Statistical Learning Theory: A Primer

  • Theodoros Evgeniou
  • Massimiliano Pontil
  • Tomaso Poggio


In this paper we first overview the main concepts of Statistical Learning Theory, a framework in which learning from examples can be studied in a principled way. We then briefly discuss well-known as well as emerging learning techniques, such as Regularization Networks and Support Vector Machines, which can be justified in terms of the same induction principle.

Keywords: VC-dimension, structural risk minimization, regularization networks, support vector machines





Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Theodoros Evgeniou (1)
  • Massimiliano Pontil (1)
  • Tomaso Poggio (1)
  1. Center for Biological and Computational Learning, Artificial Intelligence Laboratory, MIT, Cambridge, USA
