In this paper we study a family of gradient descent algorithms for approximating the regression function from reproducing kernel Hilbert spaces (RKHSs), the family being characterized by a polynomially decreasing rate of step sizes (or learning rates). By solving a bias-variance trade-off we obtain an early stopping rule and probabilistic upper bounds for the convergence of the algorithms. We also discuss the implications of these results in the context of classification, where fast convergence rates can be achieved for plug-in classifiers. Connections are addressed with boosting, Landweber iterations, and online learning algorithms viewed as stochastic approximations of the gradient descent method.
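The family of algorithms described above can be illustrated with a minimal sketch: gradient descent on the empirical squared loss in an RKHS, with step sizes decaying as eta_t = eta0 * t^(-theta) and the iteration stopped at a fixed time. The function names, the choice of a Gaussian kernel, and the stopping time used below are illustrative assumptions, not the paper's prescription; in particular, the paper derives its stopping rule from a bias-variance trade-off rather than fixing it by hand.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix (illustrative choice of kernel).
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-d2 / (2.0 * sigma**2))

def kernel_gradient_descent(X, y, eta0=1.0, theta=0.5, t_stop=200, sigma=1.0):
    """Gradient descent on the empirical squared loss in an RKHS, with
    polynomially decaying step sizes eta_t = eta0 * t**(-theta) and a
    fixed stopping time t_stop (a stand-in for the early stopping rule)."""
    n = len(y)
    K = gaussian_kernel(X, X, sigma)
    alpha = np.zeros(n)                 # f_t = sum_i alpha_i k(x_i, .)
    for t in range(1, t_stop + 1):
        eta_t = eta0 / t**theta         # polynomially decreasing learning rate
        grad = (K @ alpha - y) / n      # RKHS gradient of the empirical risk, in coefficients
        alpha = alpha - eta_t * grad
    return alpha

# Usage: fit noisy samples of a smooth target, then predict at new points.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha = kernel_gradient_descent(X, y, eta0=1.0, theta=0.5, t_stop=200)
X_new = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)
y_pred = gaussian_kernel(X_new, X, sigma=1.0) @ alpha
```

Stopping the iteration early plays the role of regularization here: running the loop far past t_stop would drive the training error down while letting the estimator overfit the noise, which is the trade-off the paper's stopping rule balances.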
Cite this article
Yao, Y., Rosasco, L. & Caponnetto, A. On Early Stopping in Gradient Descent Learning. Constr Approx 26, 289–315 (2007). https://doi.org/10.1007/s00365-006-0663-2
Keywords

- Convergence Rate
- Gradient Descent
- Tikhonov Regularization
- Reproducing Kernel Hilbert Space
- Gradient Descent Method