Abstract
A well-known approach in the design of efficient algorithms, called matrix sparsification, approximates a matrix A with a sparse matrix \(A'\). Achlioptas and McSherry (J ACM 54(2):9-es, 2007) initiated a long line of work on spectral-norm sparsification, which aims to guarantee that \(\Vert A'-A\Vert \le \epsilon \Vert A\Vert \) for error parameter \(\epsilon >0\). Various forms of matrix approximation motivate considering this problem with a guarantee according to the Schatten p-norm for general p, which includes the spectral norm as the special case \(p=\infty \). We investigate the relation between fixed but different \(p\ne q\), that is, whether sparsification in the Schatten p-norm implies (existentially and/or algorithmically) sparsification in the Schatten \(q\text {-norm}\) with similar sparsity. An affirmative answer would be tremendously useful, as it would identify which value of p to focus on. Our main finding is a surprising contrast between this question and the analogous case of \(\ell _p\)-norm sparsification for vectors: For vectors, the answer is affirmative for \(p<q\) and negative for \(p>q\), but for matrices we answer negatively for almost all sufficiently distinct \(p\ne q\). In addition, our explicit constructions may be of independent interest.
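For concreteness, the Schatten p-norm of a matrix is the \(\ell_p\) norm of its vector of singular values, with \(p=\infty\) recovering the spectral norm. The following NumPy sketch (ours, not from the paper) computes Schatten norms and measures the relative error \(\Vert A'-A\Vert_{S_p}/\Vert A\Vert_{S_p}\) for a naive entrywise-thresholding sparsifier; the threshold value is an arbitrary illustrative choice.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: the l_p norm of the singular values of A.
    p = np.inf gives the spectral norm; p = 2 is the Frobenius norm."""
    s = np.linalg.svd(A, compute_uv=False)
    if np.isinf(p):
        return s.max()
    return (s ** p).sum() ** (1.0 / p)

# Toy sparsifier: zero out entries below a threshold, then measure the
# relative approximation error in several Schatten p-norms.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
threshold = 0.5
A_sparse = np.where(np.abs(A) >= threshold, A, 0.0)
for p in (1, 2, np.inf):
    rel_err = schatten_norm(A_sparse - A, p) / schatten_norm(A, p)
    print(f"p={p}: relative error {rel_err:.3f}")
```

Note that the relative error generally differs across p for the same sparse matrix \(A'\), which is precisely why the choice of norm in the sparsification guarantee matters.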
Notes
We define it for the general case of rectangular matrices, but we focus on square matrices.
For symmetric matrices A, B, we denote \(A\preceq B\) if \(B-A\) is PSD.
Hadamard matrices are not known to exist for every n, but they do exist for all powers of 2. Recall that we assumed n is a power of 2 in Sect. 1.4, thus there exists an \(n\times n\) Hadamard matrix.
References
Arora, S., Hazan, E., Kale, S.: A fast random sampling algorithm for sparsifying matrices. In: Approximation, Randomization, and Combinatorial Optimization Algorithms and Techniques, pp. 272–279. Springer, New York (2006). https://doi.org/10.1007/11830924_26
Achlioptas, D., Karnin, Z.S., Liberty, E.: Near-optimal entrywise sampling for data matrices. In: Advances in Neural Information Processing Systems, pp. 1565–1573 (2013). https://proceedings.neurips.cc/paper/2013/hash/6e0721b2c6977135b916ef286bcb49ec-Abstract.html
Achlioptas, D., McSherry, F.: Fast computation of low-rank matrix approximations. J ACM (JACM) 54(2), 9-es (2007). https://doi.org/10.1145/1219092.1219097
Bakshi, A., Clarkson, K.L., Woodruff, D.P.: Low-rank approximation with \(1/\epsilon ^{1/3}\) matrix-vector products. In: 54th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1130–1143 (2022). https://doi.org/10.1145/3519935.3519988
Bhatia, R.: Matrix Analysis. Graduate Texts in Mathematics, vol. 169. Springer-Verlag, New York (1997). https://doi.org/10.1007/978-1-4612-0653-8
Bhatia, R.: Pinching, trimming, truncating, and averaging of matrices. Am. Math. Mon. 107(7), 602–608 (2000)
Braverman, V., Krauthgamer, R., Krishnan, A., Sapir, S.: Near-optimal entrywise sampling of numerically sparse matrices. In: Conference on Learning Theory, COLT, vol. 134, pp. 759–773. Proceedings of Machine Learning Research. PMLR (2021). http://proceedings.mlr.press/v134/braverman21b.html
Bhatia, R., Kahan, W., Li, R.-C.: Pinchings and norms of scaled triangular matrices. Linear Multilinear Algebra 50(1), 15–21 (2002)
Batson, J., Spielman, D.A., Srivastava, N.: Twice-ramanujan sparsifiers. SIAM J Comput 41(6), 1704–1721 (2012). https://doi.org/10.1137/090772873
Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Commun. ACM 55(6), 111–119 (2012). https://doi.org/10.1145/2184319.2184343
Chandrasekaran, V., Sanghavi, S., Parrilo, P.A., Willsky, A.S.: Sparse and low-rank matrix decompositions. IFAC Proc Vol 42(10), 1493–1498 (2009). https://doi.org/10.3182/20090706-3-FR-2004.00249
Candès, E.J., Tao, T.: The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inf. Theory 56(5), 2053–2080 (2010). https://doi.org/10.1109/TIT.2010.2044061
Drineas, P., Zouzias, A.: A note on element-wise matrix sparsification via a matrix-valued Bernstein inequality. Inf. Process. Lett. 111(8), 385–389 (2011). https://doi.org/10.1016/j.ipl.2011.01.010
Gupta, N., Sidford, A.: Exploiting numerical sparsity for efficient learning: faster eigenvector computation and regression. Adv. Neural Inf. Process. Syst. 31, 5269–5278 (2018)
Gittens, A., Tropp, J.A.: Error bounds for random matrix approximation schemes (2009). arXiv:0911.4108
Kundu, A., Drineas, P.: A note on randomized element-wise matrix sparsification (2014). arXiv:1404.0320
Kundu, A., Drineas, P., Magdon-Ismail, M.: Recovering PCA and sparse PCA via hybrid-(\(l_1\), \(l_2\)) sparse sampling of data elements. J. Mach. Learn. Res. 18(75), 1–34 (2017)
Khetan, A., Oh, S.: Spectrum estimation from a few entries. J. Mach. Learn. Res. 20(21), 1–55 (2019)
Kong, W., Valiant, G.: Spectrum estimation from samples. Ann. Stat. 45(5), 2218–2247 (2017). https://doi.org/10.1214/16-AOS1525
Lee, Y.T., Sun, H.: Constructing linear-sized spectral sparsification in almost-linear time. SIAM J. Comput 47(6), 2315–2336 (2018). https://doi.org/10.1137/16M1061850
Li, Y., Woodruff, D.P.: Input-sparsity low rank approximation in Schatten norm. In: Proceedings of the 37th International Conference on Machine Learning, ICML, vol. 119, pp. 6001–6009. Proceedings of Machine Learning Research (2020). http://proceedings.mlr.press/v119/li20q.html
Nguyen, N.H., Drineas, P., Tran, T.D.: Tensor sparsification via a bound on the spectral norm of random tensors. Inf. Inference J. IMA 4(3), 195–229 (2015). https://doi.org/10.1093/imaiai/iav004
Nie, F., Huang, H., Ding, C.H.Q.: Low-rank matrix recovery via efficient Schatten p-norm minimization. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (2012). http://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/5165
Pudlák, P., Rödl, V.: Some combinatorial-algebraic problems from complexity theory. Discrete Math. 136(1–3), 253–279 (1994). https://doi.org/10.1016/0012-365X(94)00115-Y
Rotfel’d, S.Y.: Remarks on the singular numbers of a sum of completely continuous operators. Funct. Anal. Appl. 1(3), 95–96 (1967). https://doi.org/10.1007/BF01076915
Spielman, D.A., Teng, S.-H.: Spectral sparsification of graphs. SIAM J. Comput. 40(4), 981–1025 (2011). https://doi.org/10.1137/08074489X
Thompson, R.: Convex and concave functions of singular values of matrix sums. Pac. J. Math. 66(1), 285–290 (1976). https://doi.org/10.2140/pjm.1976.66.285
Valiant, L.G.: Graph-theoretic arguments in low-level complexity. In: Mathematical Foundations of Computer Science, 6th Symposium, vol. 53, pp. 162–176. Lecture Notes in Computer Science. Springer (1977). https://doi.org/10.1007/3-540-08353-7_135
Author information
Contributions
Both authors contributed to all parts of the research, and the formal analysis was carried out mainly by SS.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Work partially supported by ONR Award N00014-18-1-2364, the Israel Science Foundation grant #1086/18, the Israeli Council for Higher Education (CHE) via the Weizmann Data Science Research Center, and a Minerva Foundation grant.
Appendix A: Proof of Lemma 1.6
In this section, we prove Lemma 1.6.
Lemma 1.6
For all PSD matrices \(A\in \mathbb {R}^{n\times n}\) and \(\epsilon >0\), every \(\epsilon \)-spectral approximation \(A'\) of A is also an \((\epsilon ,S_p)\)-norm approximation of A, simultaneously for all \(p\ge 1\).
Proof
Let \(A'\in \mathbb {R}^{n\times n}\) be an \(\epsilon \)-spectral approximation of A, i.e., \(-\epsilon A \preceq A'-A \preceq \epsilon A\). Observe that the matrix \(A'-A\) is symmetric. Let the eigendecomposition of \(A'-A\) be \(UDU^\top \), including zero eigenvalues so that \(U\in \mathbb {R}^{n\times n}\) is orthogonal. Denote the i-th column of U by \(u_i\) (which is a normalized eigenvector). Each eigenvalue satisfies \(|D_{ii}| = |u_i^\top (A'-A) u_i| \le \epsilon \, u_i^\top A u_i\) by the spectral-approximation guarantee. Then
\[
\Vert A'-A\Vert _{S_p} = \Vert D\Vert _{S_p} \le \epsilon \,\Vert {\text {diag}}(U^\top A U)\Vert _{S_p} \le \epsilon \,\Vert U^\top A U\Vert _{S_p} = \epsilon \,\Vert A\Vert _{S_p},
\]
where \({\text {diag}}(U^\top A U)\) is a diagonal matrix with the same diagonal as \(U^\top A U\) (and zeros otherwise); the second inequality holds by Lemma 3.4 (pinching inequality [6]), and the final equality holds by unitary invariance of Schatten norms. \(\square \)
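As a quick numerical sanity check of the lemma (a sketch of ours, not part of the paper), the following NumPy snippet builds a random PSD matrix A and a perturbation of the form \(E = \epsilon \, A^{1/2} S A^{1/2}\) with S symmetric and \(\Vert S\Vert _2 \le 1\), which guarantees \(-\epsilon A \preceq E \preceq \epsilon A\), then verifies the Schatten-norm bound for several values of p.

```python
import numpy as np

def schatten(M, p):
    """Schatten p-norm: l_p norm of the singular values (p=inf is spectral)."""
    s = np.linalg.svd(M, compute_uv=False)
    return s.max() if np.isinf(p) else (s ** p).sum() ** (1.0 / p)

rng = np.random.default_rng(1)
n, eps = 30, 0.1

# Random PSD matrix A and its symmetric square root A^{1/2}.
B = rng.standard_normal((n, n))
A = B @ B.T
w, V = np.linalg.eigh(A)
A_half = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

# Perturbation E = eps * A^{1/2} S A^{1/2} with S symmetric, ||S||_2 <= 1,
# so that -eps*A <= A'-A <= eps*A holds in the Loewner order.
S = rng.standard_normal((n, n))
S = (S + S.T) / 2
S /= np.linalg.norm(S, 2)
A_prime = A + eps * (A_half @ S @ A_half)

for p in (1, 2, 4, np.inf):
    ratio = schatten(A_prime - A, p) / schatten(A, p)
    print(f"p={p}: ||A'-A||_Sp / ||A||_Sp = {ratio:.4f}  (bound: eps={eps})")
```

Every printed ratio stays below \(\epsilon\), as Lemma 1.6 predicts, and simultaneously for all tested p from a single spectral-approximation guarantee.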
About this article
Cite this article
Krauthgamer, R., Sapir, S. Comparison of Matrix Norm Sparsification. Algorithmica 85, 3957–3972 (2023). https://doi.org/10.1007/s00453-023-01172-6