Journal of Theoretical Probability, Volume 25, Issue 3, pp 655–686

How Close is the Sample Covariance Matrix to the Actual Covariance Matrix?

Abstract

Given a probability distribution in \(\mathbb{R}^n\) with general (nonwhite) covariance, a classical estimator of the covariance matrix is the sample covariance matrix obtained from a sample of \(N\) independent points. What is the optimal sample size \(N=N(n)\) that guarantees estimation with a fixed accuracy in the operator norm? Suppose that the distribution is supported in a centered Euclidean ball of radius \(O(\sqrt{n})\). We conjecture that the optimal sample size is \(N=O(n)\) for all distributions with finite fourth moment, and we prove this up to an iterated logarithmic factor. This problem is motivated by the optimal theorem of Rudelson (J. Funct. Anal. 164:60–72, 1999), which states that \(N=O(n\log n)\) for distributions with finite second moment, and a recent result of Adamczak et al. (J. Am. Math. Soc. 234:535–561, 2010), which guarantees that \(N=O(n)\) for subexponential distributions.
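The quantity studied in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's method: it draws N independent centered points, forms the sample covariance matrix, and measures its distance to the true covariance in the operator norm. The test distribution (independent random signs rescaled coordinate-wise, so every point lies in a ball of radius at most \(\sqrt{n}\)) and the names `sample_covariance`, `operator_norm_error`, and `draw_points` are assumptions chosen for illustration only.

```python
import numpy as np


def sample_covariance(points):
    """Sample covariance (1/N) * sum_i x_i x_i^T of centered points (one per row)."""
    num_points = points.shape[0]
    return points.T @ points / num_points


def operator_norm_error(estimate, truth):
    """Operator (spectral) norm of the difference between two matrices."""
    return np.linalg.norm(estimate - truth, ord=2)


def draw_points(num_points, scales, rng):
    """Draw centered points x = scales * signs with independent random signs.

    Illustrative assumption: each point has Euclidean norm at most
    max(scales) * sqrt(n), and the true covariance is diag(scales**2).
    """
    signs = rng.choice([-1.0, 1.0], size=(num_points, scales.size))
    return signs * scales


rng = np.random.default_rng(0)
n = 50
scales = np.linspace(0.5, 1.0, n)   # nonwhite: coordinates have different variances
true_cov = np.diag(scales**2)

# Compare the error for sample sizes that are fixed multiples of the dimension.
for multiple in (1, 4, 16, 64):
    N = multiple * n
    points = draw_points(N, scales, rng)
    err = operator_norm_error(sample_covariance(points), true_cov)
    print(f"N = {N:5d}  operator-norm error = {err:.3f}")
```

This test distribution has bounded coordinates, so it sits well inside the subexponential regime and does not probe the heavy-tailed case that the conjecture addresses. It only shows how the error is measured as N grows proportionally to n.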

Keywords

Sample covariance matrices; Estimation of covariance matrices; Random matrices with independent columns

Mathematics Subject Classification (2000)

60H12 60B20 46B09 

References

  1. Adamczak, R., Litvak, A., Pajor, A., Tomczak-Jaegermann, N.: Quantitative estimates of the convergence of the empirical covariance matrix in log-concave ensembles. J. Am. Math. Soc. 234, 535–561 (2010)
  2. Adamczak, R., Litvak, A., Pajor, A., Tomczak-Jaegermann, N.: Sharp bounds on the rate of convergence of the empirical covariance matrix. Preprint
  3. Aubrun, G.: Sampling convex bodies: a random matrix approach. Proc. Am. Math. Soc. 135, 1293–1303 (2007)
  4. Ball, K.: An elementary introduction to modern convex geometry. In: Flavors of Geometry. Math. Sci. Res. Inst. Publ., vol. 31, pp. 1–58. Cambridge Univ. Press, Cambridge (1997)
  5. Bai, Z.D., Yin, Y.Q.: Limit of the smallest eigenvalue of a large dimensional sample covariance matrix. Ann. Probab. 21, 1275–1294 (1993)
  6. Bourgain, J.: Random points in isotropic convex sets. In: Convex Geometric Analysis, Berkeley, CA, 1996. Math. Sci. Res. Inst. Publ., vol. 34, pp. 53–58. Cambridge Univ. Press, Cambridge (1999)
  7. Dahmen, J., Keysers, D., Pitz, M., Ney, H.: Structured covariance matrices for statistical image object recognition. In: 22nd Symposium of the German Association for Pattern Recognition, pp. 99–106. Springer, Berlin (2000)
  8. Giannopoulos, A., Hartzoulaki, M., Tsolomitis, A.: Random points in isotropic unconditional convex bodies. J. Lond. Math. Soc. 72, 779–798 (2005)
  9. Giannopoulos, A., Milman, V.D.: Euclidean structure in finite dimensional normed spaces. In: Handbook of the Geometry of Banach Spaces, vol. I, pp. 707–779. North-Holland, Amsterdam (2001)
  10. Kovačević, E., Chebira, A.: An Introduction to Frames, Foundations and Trends in Signal Processing. Now Publishers, Hanover (2008)
  11. Kannan, R., Lovász, L., Simonovits, M.: Random walks and \(O^{*}(n^{5})\) volume algorithm for convex bodies. Random Struct. Algorithms 2, 1–50 (1997)
  12. Kannan, R., Rademacher, L.: Optimization of a convex program with a polynomial perturbation. Oper. Res. Lett. 37, 384–386 (2009)
  13. Krim, H., Viberg, M.: Two decades of array signal processing research: the parametric approach. IEEE Signal Process. Mag. 13, 67–94 (1996)
  14. Latala, R.: Some estimates of norms of random matrices. Proc. Am. Math. Soc. 133, 1273–1282 (2005)
  15. Ledoit, O., Wolf, M.: Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J. Empir. Finance 10, 603–621 (2003)
  16. Ledoux, M., Talagrand, M.: Probability in Banach spaces. Isoperimetry and Processes. Ergebnisse der Mathematik und ihrer Grenzgebiete (3), vol. 23. Springer, Berlin (1991)
  17. Levina, E., Vershynin, R.: Partial estimation of covariance matrices. Preprint
  18. Milman, V., Schechtman, G.: Asymptotic Theory of Finite Dimensional Normed Spaces. Lecture Notes in Mathematics, vol. 1200. Springer, Berlin (1986)
  19. Paouris, G.: Concentration of mass on convex bodies. Geom. Funct. Anal. 16, 1021–1049 (2006)
  20. Pisier, G.: Remarques sur un résultat non publié de B. Maurey. In: Seminar on Functional Analysis (1980–1981), Ecole Polytech, Palaiseau, 1981. Exp. no V, 13 pp.
  21. Rothman, A.J., Levina, E., Zhu, J.: Generalized thresholding of large covariance matrices. J. Am. Stat. Assoc. (Theory Methods) 104, 177–186 (2009)
  22. Rudelson, M.: Random vectors in the isotropic position. J. Funct. Anal. 164, 60–72 (1999)
  23. Rudelson, M., Vershynin, R.: Non-asymptotic theory of random matrices: extreme singular values. In: Proceedings of the International Congress of Mathematicians, Hyderabad, India (2010), to appear
  24. Schäfer, J., Strimmer, K.: A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat. Appl. Genet. Mol. Biol. 4, 32 (2005)
  25. Seginer, Y.: The expected norm of random matrices. Combin. Probab. Comput. 9, 149–166 (2000)
  26. Van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, Berlin (1996)
  27. Vershynin, R.: Frame expansions with erasures: an approach through the non-commutative operator theory. Appl. Comput. Harmon. Anal. 18, 167–176 (2005)
  28. Vershynin, R.: Spectral norm of products of random and deterministic matrices. Probab. Theory Relat. Fields. doi:10.1007/s00440-010-0281-z
  29. Vershynin, R.: Approximating the moments of marginals of high dimensional distributions. Ann. Probab. (to appear)

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Mathematics, University of Michigan, Ann Arbor, USA
