Volume 31, Issue 2, pp 155–171

Complexity control in statistical learning

  • Sameer M. Jalnapurkar


We consider the problem of determining a model for a given system on the basis of experimental data. The amount of data available is limited and, further, may be corrupted by noise. In this situation, it is important to control the complexity of the class of models from which our model is to be chosen. In this paper, we first give a simplified overview of the principal features of learning theory. Then we describe how the method of regularization is used to control complexity in learning. We discuss two examples of regularization: one in which the function space used is finite dimensional, and another in which it is a reproducing kernel Hilbert space. Our exposition follows the formulation of Cucker and Smale. We give a new method of bounding the sample error in the regularization scenario, which avoids some difficulties in the derivation given by Cucker and Smale.
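The finite-dimensional regularization scenario mentioned above can be illustrated with a generic ridge-regression sketch (this is an illustrative example, not code from the paper; the function and variable names below are all assumptions):

```python
import numpy as np

def ridge_fit(X, y, gamma):
    """Minimize (1/n)||X w - y||^2 + gamma * ||w||^2 in closed form.

    The penalty weight gamma controls the complexity of the fitted
    model: larger gamma shrinks the coefficient vector toward zero.
    """
    n, d = X.shape
    A = X.T @ X / n + gamma * np.eye(d)   # regularized normal matrix
    b = X.T @ y / n
    return np.linalg.solve(A, b)

# Synthetic, noisy data from a known linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=50)

w_small = ridge_fit(X, y, gamma=1e-6)  # weak penalty: near least squares
w_large = ridge_fit(X, y, gamma=1e3)   # strong penalty: coefficients shrink
print(np.linalg.norm(w_small), np.linalg.norm(w_large))
```

Varying gamma trades the two error terms discussed in the paper: a small gamma allows a rich model class (low approximation error, high sample error), while a large gamma restricts it.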


Keywords: Complexity control; learning theory; regularisation; covering number




References

  1. Burnham K P, Anderson D 2002 Model selection and multi-model inference (Springer-Verlag)
  2. Cucker F, Smale S 2002a Best choices for regularization parameters in learning theory: On the bias-variance problem. Found. Comput. Math. 2: 413–428
  3. Cucker F, Smale S 2002b On the mathematical foundations of learning. Bull. Am. Math. Soc. 39: 1–49
  4. DeVito E, Rosasco L, Caponnetto A, DeGiovannini U, Odone F 2005 Learning from examples as an inverse problem. J. Mach. Learn. Res. 6: 883–904
  5. Evgeniou T, Pontil M, Poggio T 2000 Regularization networks and support vector machines. Adv. Comput. Math. 13: 1–50
  6. Hastie T, Tibshirani R, Friedman J 2001 The elements of statistical learning: data mining, inference and prediction (New York: Springer-Verlag)
  7. Karandikar R L, Vidyasagar M 2002 Rates of uniform convergence of empirical means with mixing processes. Stat. Probab. Lett. 58: 297–307
  8. Schölkopf B, Smola A 2002 Learning with kernels: Support vector machines, regularization, optimization and beyond (Cambridge, MA: MIT Press)
  9. Sin C-Y, White H 1996 Information criteria for selecting possibly misspecified parametric models. J. Econometrics 71: 207–225
  10. Smale S, Zhou D 2005 Learning theory estimates via integral operators and their approximations. Preprint available at http://www.tti-c.org/smale-papers/sampIII5412.pdf
  11. Vapnik V 1998 Statistical learning theory (New York: John Wiley & Sons)
  12. Vidyasagar M 1997 A theory of learning and generalization (Berlin: Springer-Verlag)

Copyright information

© Indian Academy of Sciences 2006

Authors and Affiliations

  • Sameer M. Jalnapurkar
    1. Department of Mathematics, Indian Institute of Science, Bangalore, India
