Constructive Approximation, Volume 25, Issue 1, pp 1–27

The Entropy in Learning Theory. Error Estimates

• S.V. Konyagin
• V.N. Temlyakov

Abstract

We continue the investigation of some problems in learning theory in the setting formulated by F. Cucker and S. Smale. The goal is to construct, from given data $${\bf z}:=((x_1,y_1),\dots,(x_m,y_m))$$, an estimator $$f_{\bf z}$$ that approximates well the regression function $$f_\rho$$ of an unknown Borel probability measure $$\rho$$ defined on $$Z=X\times Y$$. We assume that $$f_\rho$$ belongs to a function class $$\Theta$$. It is known from previous work that the behavior of the entropy numbers $$\epsilon_n(\Theta,{\cal C})$$ of $$\Theta$$ in the uniform norm $${\cal C}$$ plays an important role in this problem. The standard way of measuring the error between a target function $$f_\rho$$ and an estimator $$f_{\bf z}$$, used in previous papers and continued here, is the $$L_2(\rho_X)$$ norm, where $$\rho_X$$ is the marginal probability measure on $$X$$ generated by $$\rho$$. The use of the $$L_2(\rho_X)$$ norm in measuring the error has motivated us to study the case when the assumption is made on the entropy numbers $$\epsilon_n(\Theta,L_2(\rho_X))$$ of $$\Theta$$ in the $$L_2(\rho_X)$$ norm. This is the main new ingredient of this paper. We construct good estimators in three settings: (1) both $$\Theta$$ and $$\rho_X$$ are known; (2) $$\Theta$$ is known but $$\rho_X$$ is not; and (3) it is known only that $$\Theta$$ belongs to a known collection of classes, and $$\rho_X$$ is unknown. An estimator from the third setting is called a universal estimator.
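For readers unfamiliar with the Cucker–Smale setting, the standard definitions of the objects named in the abstract can be recalled as follows (these formulas restate the usual textbook definitions; they are not quoted from the paper itself):

```latex
% Regression function of \rho: the conditional mean of y given x
f_\rho(x) = \int_Y y \, d\rho(y \mid x), \qquad x \in X.

% Error of an estimator f_{\bf z}, measured in the L_2(\rho_X) norm:
\| f_\rho - f_{\bf z} \|_{L_2(\rho_X)}^2
  = \int_X \bigl( f_\rho(x) - f_{\bf z}(x) \bigr)^2 \, d\rho_X(x).

% Entropy numbers of \Theta in a Banach space B
% (here B = {\cal C} or B = L_2(\rho_X)):
\epsilon_n(\Theta, B)
  := \inf \Bigl\{ \epsilon > 0 \,:\, \exists\, f_1, \dots, f_{2^n} \in B
     \ \text{with}\ \Theta \subset \bigcup_{j=1}^{2^n}
     \{\, f : \| f - f_j \|_B \le \epsilon \,\} \Bigr\}.
```

Thus the assumption on $$\epsilon_n(\Theta,L_2(\rho_X))$$ measures the compactness of $$\Theta$$ in the same norm used to measure the estimation error, rather than in the stronger uniform norm.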

Keywords

Error Estimate · Lebesgue Measure · Learning Theory · Borel Probability Measure · Uniform Norm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.