Near-optimal mean estimators with respect to general norms

Probability Theory and Related Fields

Abstract

We study the problem of estimating the mean of a random vector in \(\mathbb {R}^d\) based on an i.i.d. sample, when the accuracy of the estimator is measured by a general norm on \(\mathbb {R}^d\). We construct an estimator (that depends on the norm) that achieves an essentially optimal accuracy/confidence tradeoff under the sole assumption that the random vector has a well-defined covariance matrix. At the heart of the argument is the construction of a uniform median-of-means estimator in a class of real-valued functions.
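As a rough illustration of the building block named above, the classical one-dimensional median-of-means estimator can be sketched as follows. This is not the uniform, norm-dependent estimator constructed in the paper; it only shows the basic "split into blocks, average each block, take the median" idea. The function name `median_of_means` and the parameter `num_blocks` are assumptions made for this sketch.

```python
import statistics


def median_of_means(sample, num_blocks):
    """Classical scalar median-of-means (illustrative sketch only).

    Splits the sample into `num_blocks` consecutive blocks of equal size,
    averages each block, and returns the median of the block averages.
    """
    values = list(sample)
    if not 1 <= num_blocks <= len(values):
        raise ValueError("num_blocks must be between 1 and the sample size")
    # Leftover points that do not fill a whole block are dropped.
    block_size = len(values) // num_blocks
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# A single gross outlier corrupts only one block average.
sample = [0.0] * 8 + [1000.0]
estimate = median_of_means(sample, num_blocks=3)
# estimate is 0.0, while the empirical mean of the sample is about 111.1
```

In the scalar setting the number of blocks is usually chosen in proportion to \(\log (1/\delta )\) for a target confidence level \(1-\delta \); the choice of 3 blocks here is purely for demonstration.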

Notes

  1. In this article we focus on optimal orders of magnitude and ignore the (important) problem of optimizing constants.

  2. Recall that a random vector X is L-sub-Gaussian if for every \(t \in \mathbb {R}^d\) and every \(p \ge 2\), \(\Vert \left\langle X-\mu ,t \right\rangle \Vert _{L_p} \le L \sqrt{p} \Vert \left\langle X-\mu ,t \right\rangle \Vert _{L_2}\).

  3. Here and in what follows we identify linear functionals on \(\mathbb {R}^d\) with points in \(\mathbb {R}^d\), and the action of \(t \in \mathbb {R}^d\) is given by \(x^*(x)=\left\langle t,x \right\rangle \), that is, the standard inner product with t.

  4. For the standard proof of (1.5) for a general sub-Gaussian process, see, e.g., [12].

  5. We thank the anonymous referee for pointing out this fact.

References

  1. Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58, 137–147 (1999)

  2. Artstein, S., Milman, V., Szarek, S.J.: Duality of metric entropy. Ann. Math. 159, 1313–1328 (2004)

  3. Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford (2013)

  4. Catoni, O.: Challenging the empirical mean and empirical variance: a deviation study. Ann. Inst. Henri Poincaré Probab. Stat. 48(4), 1148–1185 (2012)

  5. Catoni, O.: PAC-Bayesian bounds for the Gram matrix and least squares regression with a random design. arXiv:1603.05229 (2016)

  6. Catoni, O., Giulini, I.: Dimension-free PAC-Bayesian bounds for matrices, vectors, and linear least squares regression. arXiv:1712.02747 (2017)

  7. Devroye, L., Lerasle, M., Lugosi, G., Oliveira, R.I.: Sub-Gaussian mean estimators. Ann. Stat. 44, 2695–2725 (2016)

  8. Hopkins, S.B.: Sub-Gaussian mean estimation in polynomial time. arXiv:1809.07425 (2018)

  9. Hsu, D., Sabato, S.: Loss minimization and parameter estimation with heavy tails. J. Mach. Learn. Res. 17, 1–40 (2016)

  10. Jerrum, M., Valiant, L., Vazirani, V.: Random generation of combinatorial structures from a uniform distribution. Theor. Comput. Sci. 43, 169–188 (1986)

  11. Joly, E., Lugosi, G., Oliveira, R.I.: On the estimation of the mean of a random vector. Electron. J. Stat. 11, 440–451 (2017)

  12. Krahmer, F., Mendelson, S., Rauhut, H.: Suprema of chaos processes and the restricted isometry property. Commun. Pure Appl. Math. 67(11), 1877–1904 (2014)

  13. Ledoux, M., Talagrand, M.: Probability in Banach Spaces. Springer, New York (1991)

  14. Lerasle, M., Oliveira, R.I.: Robust empirical mean estimators. arXiv:1112.3914 (2012)

  15. Lugosi, G., Mendelson, S.: Sub-Gaussian estimators of the mean of a random vector. Ann. Stat. 47(2), 783–794 (2019)

  16. S. Mendelson.: “Local” vs. “global” parameters—breaking the Gaussian complexity barrier. Ann. Stat. 45(5):1835–1862 (2017)

  17. Minsker, S.: Geometric median and robust estimation in Banach spaces. Bernoulli 21, 2308–2335 (2015)

  18. Minsker, S.: Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries. arXiv:1605.07129 (2016)

  19. Nemirovsky, A.S., Yudin, D.B.: Problem complexity and method efficiency in optimization. Wiley, Hoboken (1983)

Author information

Correspondence to Gábor Lugosi.

Additional information

Gábor Lugosi was supported by the Spanish Ministry of Economy and Competitiveness, Grant MTM2015-67304-P and FEDER, EU.

About this article

Cite this article

Lugosi, G., Mendelson, S. Near-optimal mean estimators with respect to general norms. Probab. Theory Relat. Fields 175, 957–973 (2019). https://doi.org/10.1007/s00440-019-00906-4

  • DOI: https://doi.org/10.1007/s00440-019-00906-4
