Abstract
Rates of convergence for empirical risk minimizers have been well studied in the literature. In this paper, we aim to provide a complementary set of results, in particular by showing that after normalization, the risk of the empirical minimizer concentrates on a single point. Such results have been established by Chatterjee (The Annals of Statistics, 42(6):2340–2381, 2014) for constrained estimators in the normal sequence model. We first generalize and sharpen this result to regularized least squares with convex penalties, making use of a “direct” argument based on Borell’s theorem. We then study generalizations to other loss functions, including the negative log-likelihood for exponential families combined with a strictly convex regularization penalty. The results in this general setting are based on more “indirect” arguments as well as on concentration inequalities for maxima of empirical processes.
References
Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Inventiones Mathematicae 30, 2, 207–216.
Boucheron, S. and Massart, P. (2011). A high-dimensional Wilks phenomenon. Probability Theory and Related Fields 150, 3-4, 405–433.
Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration inequalities: A nonasymptotic theory of independence. OUP Oxford.
Chatterjee, S. (2014). A new perspective on least squares under convex constraint. The Annals of Statistics 42, 6, 2340–2381.
Klein, T. (2002). Une inégalité de concentration à gauche pour les processus empiriques. Comptes Rendus Mathematique 334, 6, 501–504.
Klein, T. and Rio, E. (2005). Concentration around the mean for maxima of empirical processes. The Annals of Probability 33, 3, 1060–1077.
Koltchinskii, V. (2011). Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: École d'Été de Probabilités de Saint-Flour XXXVIII-2008, volume 38. Springer Science & Business Media.
Ledoux, M. (2001). The concentration of measure phenomenon, volume 89. American Mathematical Society.
Massart, P. (2000). Some applications of concentration inequalities to statistics. Annales de la Faculté des Sciences de Toulouse: Mathématiques, volume 9, pages 245–303.
Muro, A. and van de Geer, S. (2015). Concentration behavior of the penalized least squares estimator. arXiv:1511.08698.
Rockafellar, R.T. (1970). Convex analysis. Princeton University Press.
Saumard, A. (2012). Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression. Electronic Journal of Statistics 6, 579–655.
Talagrand, M. (1995). Concentration of measure and isoperimetric inequalities in product spaces. Publications Mathématiques de l’IHÉS 81, 73–205.
van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes. Springer Series in Statistics. Springer-Verlag, New York. ISBN 0-387-94640-3.
Cite this article
van de Geer, S., Wainwright, M.J. On Concentration for (Regularized) Empirical Risk Minimization. Sankhya A 79, 159–200 (2017). https://doi.org/10.1007/s13171-017-0111-9
DOI: https://doi.org/10.1007/s13171-017-0111-9
Keywords and phrases.
- Concentration
- Density estimation
- Empirical process
- Empirical risk minimization
- Normal sequence model
- Penalized least squares