Abstract
We study the problem of estimating the mean of a random vector in \(\mathbb {R}^d\) based on an i.i.d. sample, when the accuracy of the estimator is measured by a general norm on \(\mathbb {R}^d\). We construct an estimator (that depends on the norm) that achieves an essentially optimal accuracy/confidence tradeoff under the sole assumption that the random vector has a well-defined covariance matrix. At the heart of the argument is the construction of a uniform median-of-means estimator in a class of real-valued functions.
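For orientation, the classical one-dimensional median-of-means idea can be sketched as follows. This is only the standard scalar building block, not the norm-dependent uniform estimator constructed in the article; the function name `median_of_means` and the parameter `num_blocks` are illustrative choices, and the sample is assumed to be i.i.d. (so that consecutive blocks are exchangeable).

```python
import random
import statistics


def median_of_means(sample, num_blocks):
    """Classical scalar median-of-means estimate of the mean of `sample`.

    The sample is split into `num_blocks` consecutive blocks of equal size,
    the empirical mean of each block is computed, and the median of those
    block means is returned.
    """
    n = len(sample)
    if not 1 <= num_blocks <= n:
        raise ValueError("num_blocks must be between 1 and len(sample)")
    block_size = n // num_blocks  # leftover points at the end are discarded
    block_means = [
        statistics.fmean(sample[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


if __name__ == "__main__":
    rng = random.Random(0)
    # Heavy-tailed illustration: a Pareto law with shape 3 has finite
    # variance and mean 1.5 (hypothetical test data, not from the article).
    data = [rng.paretovariate(3.0) for _ in range(10_000)]
    print(median_of_means(data, num_blocks=20))
```

In this sketch a single extreme observation can corrupt at most one block mean, which the median then ignores. The number of blocks is typically taken proportional to \(\log (1/\delta )\) for a target confidence level \(1-\delta \).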
Notes
In this article we focus on optimal orders of magnitude and ignore the important problem of optimizing constants.
Recall that a random vector X is L-sub-Gaussian if for every \(t \in \mathbb {R}^d\) and every \(p \ge 2\), \(\Vert \left\langle X-\mu ,t \right\rangle \Vert _{L_p} \le L \sqrt{p} \Vert \left\langle X-\mu ,t \right\rangle \Vert _{L_2}\).
Here and in what follows we identify linear functionals on \(\mathbb {R}^d\) with points in \(\mathbb {R}^d\): the functional \(x^*\) identified with \(t \in \mathbb {R}^d\) acts by \(x^*(x)=\left\langle t,x \right\rangle \), that is, by the standard inner product with t.
We thank the anonymous referee for pointing out this fact.
Additional information
Gábor Lugosi was supported by the Spanish Ministry of Economy and Competitiveness, Grant MTM2015-67304-P and FEDER, EU.
Cite this article
Lugosi, G., Mendelson, S. Near-optimal mean estimators with respect to general norms. Probab. Theory Relat. Fields 175, 957–973 (2019). https://doi.org/10.1007/s00440-019-00906-4