  • Published: February 1999

Risk bounds for model selection via penalization

  • Andrew Barron1,
  • Lucien Birgé2 &
  • Pascal Massart3 

Probability Theory and Related Fields volume 113, pages 301–413 (1999)


Abstract

Performance bounds for model selection criteria are developed using recent theory for sieves. The criteria are based on an empirical loss or contrast function with an added penalty term, motivated by empirical process theory and roughly proportional to the number of parameters needed to describe the model divided by the number of observations. Most of our examples involve density or regression estimation settings, and we focus on the problem of estimating the unknown density or regression function. We show that the quadratic risk of the minimum penalized empirical contrast estimator is bounded by an index of the accuracy of the sieve. This accuracy index quantifies the trade-off, among the candidate models, between the approximation error and the parameter dimension relative to sample size.
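
In schematic form (the notation here is ours, a sketch rather than the paper's exact statements): writing \gamma_n for the empirical contrast, \hat{s}_m for the minimum contrast estimator over a model S_m of dimension D_m, and \mathrm{pen}(m) for the penalty, the selected index and the typical risk bound read

\hat{m} = \operatorname*{arg\,min}_{m \in \mathcal{M}} \bigl\{ \gamma_n(\hat{s}_m) + \mathrm{pen}(m) \bigr\}, \qquad \mathrm{pen}(m) \asymp \frac{D_m}{n},

\mathbb{E}\bigl[ d^2(s, \hat{s}_{\hat{m}}) \bigr] \le C \inf_{m \in \mathcal{M}} \Bigl\{ d^2(s, S_m) + \frac{D_m}{n} \Bigr\},

where s is the true density or regression function, d is the relevant quadratic distance, and C does not depend on s.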

If we choose a list of models that exhibit good approximation properties with respect to different classes of smoothness, the estimator can be simultaneously minimax rate optimal in each of those classes; this is what is usually called adaptation. The smoothness classes in which one gets adaptation depend heavily on the list of models. When many models must be involved in order to approximate many wide classes of functions accurately at the same time, the estimator may be only approximately adaptive (typically up to a slowly varying function of the sample size).
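
As a standard instance of this phenomenon (our illustration, not a statement copied from the paper): over a Hölder class \mathcal{H}(\alpha, L) the minimax quadratic rate is n^{-2\alpha/(2\alpha+1)}, and adaptation means that a single penalized estimator satisfies

\sup_{s \in \mathcal{H}(\alpha, L)} \mathbb{E}\bigl[ d^2(s, \hat{s}_{\hat{m}}) \bigr] \le C(\alpha, L)\, n^{-2\alpha/(2\alpha+1)}

simultaneously for every \alpha in a range of smoothness values, possibly degraded by a slowly varying (e.g. logarithmic) factor of n when the list of models is very rich.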

We shall provide various illustrations of our method, such as penalized maximum likelihood, projection, or least squares estimation. The models will involve commonly used finite dimensional expansions such as piecewise polynomials with fixed or variable knots, trigonometric polynomials, wavelets, neural nets, and related nonlinear expansions defined by superposition of ridge functions.
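
As a toy illustration of minimum penalized least squares in Python (a sketch under assumptions made here for concreteness: a list of regular histogram models on [0, 1], a known noise level sigma, and a Mallows-C_p-style penalty constant; none of this is code from the paper), one can select the number of bins by minimizing the residual sum of squares plus a penalty proportional to D/n:

import numpy as np

def fit_histogram(x, y, n_bins):
    # Least-squares fit of a piecewise constant function on n_bins equal
    # bins of [0, 1]: the minimizer is the mean of y within each bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    means = np.array([y[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])
    return means[idx]  # fitted values at the design points

def select_model(x, y, max_bins, sigma):
    # Minimum penalized empirical contrast: the empirical contrast is the
    # mean squared residual, and the penalty is proportional to
    # (model dimension) / (sample size).
    n = len(y)
    best_crit, best_dim, best_fit = np.inf, None, None
    for D in range(1, max_bins + 1):
        fitted = fit_histogram(x, y, D)
        contrast = np.mean((y - fitted) ** 2)       # empirical contrast gamma_n
        crit = contrast + 2.0 * sigma ** 2 * D / n  # pen(m) = 2 sigma^2 D_m / n (assumed)
        if crit < best_crit:
            best_crit, best_dim, best_fit = crit, D, fitted
    return best_dim, best_fit

# Usage: noisy observations of a smooth regression function on [0, 1].
rng = np.random.default_rng(0)
n, sigma = 500, 0.3
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + sigma * rng.normal(size=n)
D_hat, fitted = select_model(x, y, max_bins=50, sigma=sigma)
print("selected dimension (number of bins):", D_hat)

The penalty constant 2 sigma^2 mirrors Mallows' C_p and is a choice made here; the paper justifies penalties of this order, with constants derived from empirical process bounds rather than from a known noise level.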


Author information

Authors and Affiliations

1. Department of Statistics, Yale University, P.O. Box 208290, New Haven, CT 06520-8290, USA. e-mail: barron@stat.yale.edu

    Andrew Barron

2. URA 1321 “Statistique et modèles aléatoires”, Laboratoire de Probabilités, boîte 188, Université Paris VI, 4 Place Jussieu, F-75252 Paris Cedex 05, France. e-mail: lb@ccr.jussieu.fr

    Lucien Birgé

3. URA 743 “Modélisation stochastique et Statistique”, Bât. 425, Université Paris Sud, Campus d'Orsay, F-91405 Orsay Cedex, France. e-mail: massart@stats.matups.fr

    Pascal Massart


Additional information

Received: 7 July 1995 / Revised version: 1 November 1997


About this article

Cite this article

Barron, A., Birgé, L. & Massart, P. Risk bounds for model selection via penalization. Probab. Theory Relat. Fields 113, 301–413 (1999). https://doi.org/10.1007/s004400050210

  • Issue Date: February 1999

  • DOI: https://doi.org/10.1007/s004400050210


  • Mathematics subject classifications (1991): Primary 62G05, 62G07; secondary 41A25
  • Key words and phrases: Penalization – Model selection – Adaptive estimation – Empirical processes – Sieves – Minimum contrast estimators