From Model Selection to Adaptive Estimation

Birgé, Lucien; Massart, Pascal

doi:10.1007/978-1-4612-1880-7_4

Abstract

Many different model selection information criteria can be found in the literature, in contexts including regression and density estimation. The literature on this subject is vast, and in this paper we content ourselves with citing a few typical references to illustrate our presentation. Let us just mention the AIC, C_p or C_L, BIC and MDL criteria proposed by Akaike (1973), Mallows (1973), Schwarz (1978) and Rissanen (1978), respectively. These methods select, from a given collection of parametric models, the model that minimizes an empirical loss (typically squared error or minus log-likelihood) plus a penalty term proportional to the dimension of the model. From one criterion to another, the penalty functions differ by factors of log n, where n is the number of observations.
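
As an illustration of the penalized criteria described in the abstract, the following is a minimal sketch, not the authors' procedure. It assumes a simplified setting that is not taken from the chapter: nested models in which the model of dimension D keeps the first D empirical coefficients in an orthonormal basis, each coefficient carrying noise of variance noise_var / n, so that the empirical squared-error loss equals, up to a constant, minus the sum of the first D squared coefficients. The function name `select_dimension`, its parameters, and the sample coefficients are hypothetical. A penalty factor of 2 mimics a C_p/AIC-type penalty and a factor of log n mimics a BIC-type penalty.

```python
import math


def select_dimension(coeffs, noise_var, n, penalty_factor):
    """Pick the dimension D minimizing (empirical loss + penalty) over nested models.

    Assumed setting (illustrative only): model D keeps the first D coefficients,
    the empirical loss is -sum(coeffs[:D] ** 2) up to an additive constant, and
    the penalty is penalty_factor * D * noise_var / n.
    """
    best_dim, best_crit = 0, 0.0  # D = 0 is the empty model with criterion 0
    fit = 0.0
    for dim, c in enumerate(coeffs, start=1):
        fit += c * c  # cumulative squared coefficients for model `dim`
        crit = -fit + penalty_factor * dim * noise_var / n
        if crit < best_crit:
            best_dim, best_crit = dim, crit
    return best_dim


# Hypothetical empirical coefficients, sample size and noise variance.
coeffs = [1.0, 0.5, 0.18, 0.02, 0.01]
n = 100
noise_var = 1.0

aic_like = select_dimension(coeffs, noise_var, n, 2.0)          # penalty 2 * D * noise_var / n
bic_like = select_dimension(coeffs, noise_var, n, math.log(n))  # penalty log(n) * D * noise_var / n
print(aic_like, bic_like)
```

With these made-up numbers the third squared coefficient (0.0324) exceeds the per-dimension penalty 2/n = 0.02 but not log(n)/n, roughly 0.046, so the heavier log n penalty selects a smaller model. This is the kind of log n difference between criteria that the abstract points to.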

References

Akaike, H. (1973), Information theory and an extension of the maximum likelihood principle, in P. N. Petrov & F. Csaki, eds, ‘Proceedings 2nd International Symposium on Information Theory’, Akadémiai Kiadó, Budapest, pp. 267–281.
Barron, A. R. & Cover, T. M. (1991), ‘Minimum complexity density estimation’, IEEE Transactions on Information Theory 37, 1034–1054.
Barron, A. R., Birgé, L. & Massart, P. (1995), Model selection via penalization, Technical Report 95.54, Université Paris-Sud.
Birgé, L. & Massart, P. (1994), Minimum contrast estimation on sieves, Technical Report 94.34, Université Paris-Sud.
Cirel’son, B. S., Ibragimov, I. A. & Sudakov, V. N. (1976), Norm of Gaussian sample function, in ‘Proceedings of the 3rd Japan-USSR Symposium on Probability Theory’, Springer-Verlag, New York, pp. 20–41. Springer Lecture Notes in Mathematics 550.
DeVore, R. A. & Lorentz, G. G. (1993), Constructive Approximation, Springer-Verlag, Berlin.
Donoho, D. L. & Johnstone, I. M. (1994), ‘Ideal spatial adaptation by wavelet shrinkage’, Biometrika 81, 425–455.
Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. & Picard, D. (1993), Density estimation by wavelet thresholding, Technical Report 426, Department of Statistics, Stanford University.
Efroimovich, S. Y. (1985), ‘Nonparametric estimation of a density of unknown smoothness’, Theory of Probability and Its Applications 30, 557–568.
Grenander, U. (1981), Abstract Inference, Wiley, New York.
Kerkyacharian, G. & Picard, D. (1992), ‘Estimation de densité par méthode de noyau et d’ondelettes: les liens entre la géométrie du noyau et les contraintes de régularité’, Comptes Rendus de l’Académie des Sciences, Paris, Ser. I Math. 315, 79–84.
Kerkyacharian, G., Picard, D. & Tribouley, K. (1994), L^p adaptive density estimation, Technical report, Université Paris VII.
Le Cam, L. (1973), ‘Convergence of estimates under dimensionality restrictions’, Annals of Statistics 1, 38–53.
Le Cam, L. (1986), Asymptotic Methods in Statistical Decision Theory, Springer-Verlag, New York.
Ledoux, M. (1995), Private communication.
Li, K. C. (1987), ‘Asymptotic optimality for C_p, C_L, cross-validation, and generalized cross-validation: Discrete index set’, Annals of Statistics 15, 958–975.
Mallows, C. L. (1973), ‘Some comments on C_p’, Technometrics 15, 661–675.
Mason, D. M. & van Zwet, W. R. (1987), ‘A refinement of the KMT inequality for the uniform empirical process’, Annals of Probability 15, 871–884.
Meyer, Y. (1990), Ondelettes et Opérateurs I, Hermann, Paris.
Polyak, B. T. & Tsybakov, A. B. (1990), ‘Asymptotic optimality of the C_p-criteria in regression projective estimation’, Theory of Probability and Its Applications 35, 293–306.
Rissanen, J. (1978), ‘Modeling by shortest data description’, Automatica 14, 465–471.
Rudemo, M. (1982), ‘Empirical choice of histograms and kernel density estimators’, Scandinavian Journal of Statistics 9, 65–78.
Schwarz, G. (1978), ‘Estimating the dimension of a model’, Annals of Statistics 6, 461–464.
Talagrand, M. (1994), ‘Sharper bounds for Gaussian and empirical processes’, Annals of Probability 22, 28–76.
Talagrand, M. (1995), New concentration inequalities in product spaces, Technical report, Ohio State University.
Vapnik, V. (1982), Estimation of Dependences Based on Empirical Data, Springer-Verlag, New York.
Wahba, G. (1990), Spline Models for Observational Data, Society for Industrial and Applied Mathematics, Philadelphia.
Whittaker, E. T. & Watson, G. N. (1927), A Course of Modern Analysis, Cambridge University Press, London.

Author information

Authors and Affiliations

Université Paris VI and URA CNRS 1321, Paris, France
Lucien Birgé
Université Paris Sud and URA CNRS 743, Paris, France
Pascal Massart

Editor information

Editors and Affiliations

Department of Statistics, Yale University, New Haven, CT, 06520, USA
David Pollard
Department of Mathematics, University of Oslo, Blindern, Oslo 3, Norway
Erik Torgersen
Department of Mathematics, University of Maryland, College Park, MD, 20742, USA
Grace L. Yang

About this chapter

Cite this chapter

Birgé, L., Massart, P. (1997). From Model Selection to Adaptive Estimation. In: Pollard, D., Torgersen, E., Yang, G.L. (eds) Festschrift for Lucien Le Cam. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1880-7_4

DOI: https://doi.org/10.1007/978-1-4612-1880-7_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-7323-3
Online ISBN: 978-1-4612-1880-7
eBook Packages: Springer Book Archive
