From Model Selection to Adaptive Estimation

  • Chapter
Festschrift for Lucien Le Cam

Abstract

Many different information criteria for model selection can be found in the literature, in contexts that include regression and density estimation. The literature on this subject is vast, and in this paper we content ourselves with citing only a few typical references to illustrate our presentation. Let us just mention the AIC, C_p or C_L, BIC, and MDL criteria proposed by Akaike (1973), Mallows (1973), Schwarz (1978), and Rissanen (1978), respectively. Each of these methods selects, from a given collection of parametric models, the model that minimizes an empirical loss (typically squared error or minus log-likelihood) plus a penalty term proportional to the dimension of the model. From one criterion to another, the penalty functions differ by factors of log n, where n is the number of observations.
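As a rough illustration of the penalized criteria described above, the sketch below works in a simplified setting that is an assumption of this example and is not taken from the chapter: a Gaussian sequence y_i = theta_i + noise with known variance sigma^2, and nested models in which model D keeps the first D coordinates. In that setting a C_p-type criterion can be written as RSS_D + 2 sigma^2 D and a BIC-type criterion as RSS_D + sigma^2 D log n, so the two penalties differ by a factor of (log n)/2. The function name `select_dimension` and the choice of `theta` are hypothetical.

```python
import math
import random


def select_dimension(y, penalty_per_dim):
    """Return the D in 0..n that minimizes RSS_D + penalty_per_dim * D.

    Model D keeps the first D coordinates of y and sets the others to
    zero, so its residual sum of squares is the sum of y[i]**2 for i >= D.
    Ties are broken in favour of the smallest D.
    """
    n = len(y)
    # tail[d] holds the sum of y[i]**2 over i >= d (0-based), i.e. RSS_d.
    tail = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + y[i] ** 2
    crit = [tail[d] + penalty_per_dim * d for d in range(n + 1)]
    return min(range(n + 1), key=crit.__getitem__)


# Simulated data (assumption: decaying coefficients, unit noise variance).
random.seed(0)
n = 200
sigma = 1.0
theta = [5.0 / (i + 1) for i in range(n)]
y = [t + random.gauss(0.0, sigma) for t in theta]

# C_p-type penalty: 2 * sigma^2 per dimension.
d_cp = select_dimension(y, 2.0 * sigma ** 2)
# BIC-type penalty: sigma^2 * log(n) per dimension.
d_bic = select_dimension(y, sigma ** 2 * math.log(n))

print("C_p-type choice:", d_cp, "BIC-type choice:", d_bic)
```

Because log n exceeds 2 once n is at least 8, the BIC-type penalty is the heavier one here, and for nested models it never selects a larger dimension than the C_p-type penalty does.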

References

  • Akaike, H. (1973), Information theory and an extension of the maximum likelihood principle, in B. N. Petrov & F. Csáki, eds, ‘Proceedings 2nd International Symposium on Information Theory’, Akadémiai Kiadó, Budapest, pp. 267–281.

  • Barron, A. R. & Cover, T. M. (1991), ‘Minimum complexity density estimation’, IEEE Transactions on Information Theory 37, 1034–1054.

  • Barron, A. R., Birgé, L. & Massart, P. (1995), Model selection via penalization, Technical Report 95.54, Université Paris-Sud.

  • Birgé, L. & Massart, P. (1994), Minimum contrast estimation on sieves, Technical Report 94.34, Université Paris-Sud.

  • Cirel’son, B. S., Ibragimov, I. A. & Sudakov, V. N. (1976), Norms of Gaussian sample functions, in ‘Proceedings of the 3rd Japan-USSR Symposium on Probability Theory’, Springer-Verlag, New York, pp. 20–41. Springer Lecture Notes in Mathematics 550.

  • DeVore, R. A. & Lorentz, G. G. (1993), Constructive Approximation, Springer-Verlag, Berlin.

  • Donoho, D. L. & Johnstone, I. M. (1994), ‘Ideal spatial adaptation by wavelet shrinkage’, Biometrika 81, 425–455.

  • Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. & Picard, D. (1993), Density estimation by wavelet thresholding, Technical Report 426, Department of Statistics, Stanford University.

  • Efroimovich, S. Y. (1985), ‘Nonparametric estimation of a density of unknown smoothness’, Theory of Probability and Its Applications 30 557–568.

  • Grenander, U. (1981), Abstract Inference, Wiley, New York.

  • Kerkyacharian, G. & Picard, D. (1992), ‘Estimation de densité par méthode de noyau et d’ondelettes: les liens entre la géométrie du noyau et les contraintes de régularité’, Comptes Rendus de l’Académie des Sciences, Paris, Ser. I Math 315, 79–84.

  • Kerkyacharian, G., Picard, D. & Tribouley, K. (1994), L^p adaptive density estimation, Technical report, Université Paris VII.

  • Le Cam, L. (1973), ‘Convergence of estimates under dimensionality restrictions’, Annals of Statistics 1, 38–53.

  • Le Cam, L. (1986), Asymptotic Methods in Statistical Decision Theory, Springer-Verlag, New York.

  • Ledoux, M. (1995), Private communication.

  • Li, K. C. (1987), ‘Asymptotic optimality for C_p, C_L, cross-validation, and generalized cross-validation: Discrete index set’, Annals of Statistics 15, 958–975.

  • Mallows, C. L. (1973), ‘Some comments on C_p’, Technometrics 15, 661–675.

  • Mason, D. M. & van Zwet, W. R. (1987), ‘A refinement of the KMT inequality for the uniform empirical process’, Annals of Probability 15, 871–884.

  • Meyer, Y. (1990), Ondelettes et Opérateurs I, Hermann, Paris.

  • Polyak, B. T. & Tsybakov, A. B. (1990), ‘Asymptotic optimality of the C_p-criteria in regression projective estimation’, Theory of Probability and Its Applications 35, 293–306.

  • Rissanen, J. (1978), ‘Modeling by shortest data description’, Automatica 14, 465–471.

  • Rudemo, M. (1982), ‘Empirical choice of histograms and kernel density estimators’, Scandinavian Journal of Statistics 9, 65–78.

  • Schwarz, G. (1978), ‘Estimating the dimension of a model’, Annals of Statistics 6, 461–464.

  • Talagrand, M. (1994), ‘Sharper bounds for Gaussian and empirical processes’, Annals of Probability 22, 28–76.

  • Talagrand, M. (1995), New concentration inequalities in product spaces, Technical report, Ohio State University.

  • Vapnik, V. (1982), Estimation of Dependences Based on Empirical Data, Springer-Verlag, New York.

  • Wahba, G. (1990), Spline Models for Observational Data, Society for Industrial and Applied Mathematics, Philadelphia.

  • Whittaker, E. T. & Watson, G. N. (1927), A Course of Modern Analysis, Cambridge University Press, London.

Copyright information

© 1997 Springer Science+Business Media New York

About this chapter

Cite this chapter

Birgé, L., Massart, P. (1997). From Model Selection to Adaptive Estimation. In: Pollard, D., Torgersen, E., Yang, G.L. (eds) Festschrift for Lucien Le Cam. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1880-7_4

  • DOI: https://doi.org/10.1007/978-1-4612-1880-7_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-7323-3

  • Online ISBN: 978-1-4612-1880-7

  • eBook Packages: Springer Book Archive
