# Improved Bounds on Sample Size for Implicit Matrix Trace Estimators

## Abstract

This article is concerned with Monte Carlo methods for the estimation of the trace of an implicitly given matrix \(A\) whose information is only available through matrix-vector products. Such a method approximates the trace by an average of \(N\) expressions of the form \( \mathbf{w} ^t (A \mathbf{w} )\), with random vectors \( \mathbf{w} \) drawn from an appropriate distribution. We prove, discuss and experiment with bounds on the number of realizations \(N\) required to guarantee a probabilistic bound on the relative error of the trace estimation upon employing Rademacher (Hutchinson), Gaussian and uniform unit vector (with and without replacement) probability distributions. In total, one necessary bound and six sufficient bounds are proved, improving upon and extending similar estimates obtained in the seminal work of Avron and Toledo (JACM 58(2), Article 8, 2011) in several dimensions. We first improve their bound on \(N\) for the Hutchinson method, dropping a term that relates to \(\mathrm{rank}(A)\) and making the bound comparable with that for the Gaussian estimator. We further prove new sufficient bounds for the Hutchinson, Gaussian and unit vector estimators, as well as a necessary bound for the Gaussian estimator, which depend more specifically on properties of the matrix \(A\). As such, they may suggest the type of matrix for which one distribution or another provides a particularly effective or relatively ineffective stochastic estimation method.
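As a rough illustration of the estimators named in the abstract, the following minimal sketch averages \(N\) values of \( \mathbf{w} ^t (A \mathbf{w} )\) using only matrix-vector products. The function name `estimate_trace`, the `matvec` callback and the `dist` labels are hypothetical names chosen for this example. The scaling of the unit vector samples as \(\sqrt{n}\,\mathbf{e}_i\) is an assumption taken from the usual formulation of that estimator, not something stated in the abstract.

```python
import numpy as np


def estimate_trace(matvec, n, num_samples, dist="rademacher", rng=None):
    """Monte Carlo trace estimate from num_samples matrix-vector products.

    matvec: callable returning A @ w for a length-n vector w (A is never formed).
    dist:   "rademacher" (Hutchinson), "gaussian", "unit" (with replacement),
            or "unit_without_replacement".
    """
    rng = np.random.default_rng() if rng is None else rng

    indices = None
    if dist == "unit_without_replacement":
        # Distinct coordinate indices; requires num_samples <= n.
        indices = rng.choice(n, size=num_samples, replace=False)

    total = 0.0
    for k in range(num_samples):
        if dist == "rademacher":
            # Entries are +1 or -1 with equal probability.
            w = rng.choice([-1.0, 1.0], size=n)
        elif dist == "gaussian":
            # Entries are independent standard normal variables.
            w = rng.standard_normal(n)
        elif dist == "unit":
            # Assumed scaling: sqrt(n) times a uniformly chosen unit vector.
            w = np.zeros(n)
            w[rng.integers(n)] = np.sqrt(n)
        elif dist == "unit_without_replacement":
            w = np.zeros(n)
            w[indices[k]] = np.sqrt(n)
        else:
            raise ValueError(f"unknown distribution: {dist}")
        total += w @ matvec(w)

    return total / num_samples


if __name__ == "__main__":
    # Small symmetric positive semi-definite test matrix, used only via matvec.
    rng = np.random.default_rng(0)
    B = rng.standard_normal((50, 50))
    A = B @ B.T
    for name in ("rademacher", "gaussian", "unit", "unit_without_replacement"):
        est = estimate_trace(lambda v: A @ v, 50, 40, dist=name, rng=rng)
        print(name, est, np.trace(A))
```

In this sketch the Rademacher samples reproduce the trace of a diagonal matrix exactly, and unit vector sampling without replacement with \(N = n\) sums every diagonal entry once. Both are consistent with the abstract's remark that the most effective distribution depends on properties of \(A\).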

## Keywords

Randomized algorithms, Trace estimation, Monte Carlo methods, Implicit linear operators

## Mathematics Subject Classification

65C20, 65C05, 68W20

## Notes

### Acknowledgments

We thank our three anonymous referees for several valuable comments, which helped improve the text. Part of this work was completed while the second author was visiting the Instituto Nacional de Matemática Pura e Aplicada (IMPA), Rio de Janeiro, supported by a Brazilian Science Without Borders grant and hosted by Prof. J. Zubelli. Thank you all.

## References

- 1. M. Abramowitz. *Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables*. Dover, 1974.
- 2. D. Achlioptas. Database-friendly random projections. In *ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems*, PODS 01, volume 20, pages 274–281, 2001.
- 3. H. Avron. Counting triangles in large graphs using randomized matrix trace estimation. *Workshop on Large-scale Data Mining: Theory and Applications*, 2010.
- 4. H. Avron and S. Toledo. Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix. *JACM*, 58(2), 2011. Article 8.
- 5. Z. Bai, M. Fahey, and G. Golub. Some large scale matrix computation problems. *J. Comput. Appl. Math.*, 74:71–89, 1996.
- 6. C. Bekas, E. Kokiopoulou, and Y. Saad. An estimator for the diagonal of a matrix. *Appl. Numer. Math.*, 57:1214–1229, 2007.
- 7. K. van den Doel and U. Ascher. Adaptive and stochastic algorithms for EIT and DC resistivity problems with piecewise constant solutions and many measurements. *SIAM J. Scient. Comput.*, 34: doi: 10.1137/110826692, 2012.
- 8. G. H. Golub, M. Heath, and G. Wahba. Generalized cross validation as a method for choosing a good ridge parameter. *Technometrics*, 21:215–223, 1979.
- 9. E. Haber, M. Chung, and F. Herrmann. An effective method for parameter estimation with PDE constraints with multiple right-hand sides. *SIAM J. Optimization*, 22:739–757, 2012.
- 10. M. F. Hutchinson. A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. *J. Comm. Stat. Simul.*, 19:433–450, 1990.
- 11. T. Van Leeuwen, S. Aravkin, and F. Herrmann. Seismic waveform inversion by stochastic optimization. *Hindawi Intl. J. Geophysics*, 2011: doi: 10.1155/2011/689041, 2012.
- 12. A. Mood, F. A. Graybill, and D. C. Boes. *Introduction to the Theory of Statistics*. McGraw-Hill; 3rd edition, 1974.
- 13. F. Roosta-Khorasani, K. van den Doel, and U. Ascher. Stochastic algorithms for inverse problems involving PDEs and many measurements. *SIAM J. Scient. Comput.*, 2014. To appear.
- 14. R. J. Serfling. Probability inequalities for the sum in sampling without replacement. *Annals of Statistics*, 2:39–48, 1974.
- 15. A. Shapiro, D. Dentcheva, and D. Ruszczynski. *Lectures on Stochastic Programming: Modeling and Theory*. Philadelphia: SIAM, 2009.
- 16. G. J. Székely and N. K. Bakirov. Extremal probabilities for Gaussian quadratic forms. *Probab. Theory Related Fields*, 126:184–202, 2003.
- 17. J. Tropp. Column subset selection, matrix factorization, and eigenvalue optimization. *SODA*, pages 978–986, 2009. SIAM.
- 18. J. Young and D. Ridzal. An application of random projection to parameter estimation in partial differential equations. *SIAM J. Scient. Comput.*, 34:A2344–A2365, 2012.