Abstract
We investigate the behavior of the empirical minimization algorithm using various methods. We first analyze it by comparing the empirical (random) structure with the original one on the class, either in an additive sense, via the uniform law of large numbers, or in a multiplicative sense, using isomorphic coordinate projections. We then show that a direct analysis of the empirical minimization algorithm yields a significantly better bound, and that the estimates we obtain are essentially sharp. The method of proof is based on Talagrand's concentration inequality for empirical processes.
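As a rough illustration of the objects named in the abstract, the following sketch runs empirical minimization on a toy problem and measures the additive and multiplicative gaps between empirical and true means. Everything in it is an assumption made for illustration and is not taken from the paper: the finite class of functions f_t(x) = (x - t)^2, the uniform distribution on [0, 1], the sample size, and all function and variable names.

```python
import random

def empirical_mean(f, sample):
    """Empirical mean P_n f = (1/n) * sum of f(X_i) over the sample."""
    return sum(f(x) for x in sample) / len(sample)

def make_loss(t):
    """Hypothetical class member f_t(x) = (x - t)^2 (illustrative choice)."""
    return lambda x: (x - t) ** 2

def true_mean(t):
    """Exact mean P f_t when X is uniform on [0, 1]: 1/12 + (t - 1/2)^2."""
    return 1.0 / 12.0 + (t - 0.5) ** 2

# Finite toy class indexed by a grid of parameters t.
thresholds = [i / 10 for i in range(11)]
function_class = [make_loss(t) for t in thresholds]

# Seeded i.i.d. sample from the uniform distribution on [0, 1].
rng = random.Random(0)
sample = [rng.random() for _ in range(2000)]

empirical_means = [empirical_mean(f, sample) for f in function_class]
true_means = [true_mean(t) for t in thresholds]

# Empirical minimization: pick the class member with the smallest empirical mean.
best_index = min(range(len(thresholds)), key=lambda i: empirical_means[i])
best_t = thresholds[best_index]

# Additive comparison: sup over the class of |P_n f - P f|.
uniform_deviation = max(
    abs(e - p) for e, p in zip(empirical_means, true_means)
)

# Multiplicative comparison: sup over the class of |P_n f / P f - 1|.
ratio_deviation = max(
    abs(e / p - 1.0) for e, p in zip(empirical_means, true_means)
)

# Excess risk of the empirical minimizer relative to the best function in the class.
excess_risk = true_means[best_index] - min(true_means)

print(best_t, uniform_deviation, ratio_deviation, excess_risk)
```

In this toy setting the additive quantity controls every function in the class at once, while the excess risk depends only on the function that empirical minimization selects. This is the kind of gap between a uniform comparison and a direct analysis that the abstract refers to.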
Additional information
Research partially supported by NSF under award DMS-0434393.
Research partially supported by the Australian Research Council Discovery Project DP0343616.
About this article
Cite this article
Bartlett, P., Mendelson, S. Empirical minimization. Probab. Theory Relat. Fields 135, 311–334 (2006). https://doi.org/10.1007/s00440-005-0462-3
DOI: https://doi.org/10.1007/s00440-005-0462-3