Abstract
Given a matrix \(A \in \mathbb{R}^{m \times n}\) of rank \(r\) and an integer \(k < r\), the top \(k\) singular vectors provide the best rank-\(k\) approximation \(A_k\) to \(A\). When the columns of \(A\) have specific meaning, it is desirable to find (provably) “good” approximations to \(A_k\) which use only a small number of columns of \(A\). Proposed solutions to this problem have thus far focused on randomized algorithms. Our main result is a simple greedy deterministic algorithm with guarantees on the performance and the number of columns chosen. Specifically, our greedy algorithm chooses \(c\) columns from \(A\) with \(c=O\left(\frac{k^2\log k}{\epsilon^2}\,\mu^2(A)\,\ln\left(\frac{\sqrt{k}\,\|A_k\|_F}{\epsilon\,\|A-A_k\|_F}\right)\right)\) such that
\[\|A - C_{gr}C_{gr}^+A\|_F \le (1+\epsilon)\,\|A-A_k\|_F,\]
where \(C_{gr}\) is the matrix composed of the \(c\) columns, \(C_{gr}^+\) is the pseudo-inverse of \(C_{gr}\) (\(C_{gr}C_{gr}^+A\) is the best reconstruction of \(A\) from \(C_{gr}\)), and \(\mu(A)\) is a measure of the coherence in the normalized columns of \(A\). The running time of the algorithm is \(O(\mathrm{SVD}(A_k) + mnc)\), where \(\mathrm{SVD}(A_k)\) is the running time of computing the first \(k\) singular vectors of \(A\). To the best of our knowledge, this is the first deterministic algorithm with performance guarantees on the number of columns and a \((1+\epsilon)\) approximation ratio in Frobenius norm. The algorithm is simple and intuitive, and is obtained by combining a generalization of the well-known sparse approximation problem from information theory with an existence result on the possibility of sparse approximation. Tightening the analysis along either of these two dimensions would yield improved results.
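The quantities named in the abstract can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' algorithm: the selection rule in `greedy_columns` is a generic greedy step in the style of simultaneous orthogonal matching pursuit, chosen here only to show the shape of a "greedy approximation of the SVD", and all function names are hypothetical. The sketch computes the best rank-\(k\) approximation, picks \(c\) columns greedily against the target formed by the top \(k\) left singular vectors scaled by their singular values, and measures the reconstruction error of \(CC^+A\) in Frobenius norm.

```python
import numpy as np


def best_rank_k(A, k):
    """Return A_k and the m x k target U_k * Sigma_k from a thin SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    target = U[:, :k] * s[:k]      # column i is sigma_i * u_i
    A_k = target @ Vt[:k, :]       # best rank-k approximation of A
    return A_k, target


def reconstruction_error(A, cols):
    """Frobenius norm of A - C C^+ A, where C = A[:, cols]."""
    C = A[:, cols]
    return np.linalg.norm(A - C @ (np.linalg.pinv(C) @ A), "fro")


def greedy_columns(A, k, c):
    """Pick c column indices greedily (assumed rule, not the paper's)."""
    _, target = best_rank_k(A, k)
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0        # zero columns stay zero and score 0
    D = A / norms                  # normalized columns of A
    available = np.ones(A.shape[1], dtype=bool)
    residual = target.copy()
    chosen = []
    for _ in range(c):
        # Score each column by its correlation with the current residual.
        scores = np.linalg.norm(D.T @ residual, axis=1)
        scores[~available] = -np.inf
        j = int(np.argmax(scores))
        chosen.append(j)
        available[j] = False
        # Project the target onto the span of the chosen columns.
        C = D[:, chosen]
        residual = target - C @ (np.linalg.pinv(C) @ target)
    return chosen
```

To check a selection against the bound stated in the abstract, one could compare `reconstruction_error(A, cols)` with `(1 + eps) * np.linalg.norm(A - A_k, "fro")` for a chosen `eps`; whether this simplified rule meets that bound for a given `c` is not something the sketch guarantees.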
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Çivril, A., Magdon-Ismail, M. (2008). Deterministic Sparse Column Based Matrix Reconstruction via Greedy Approximation of SVD. In: Hong, SH., Nagamochi, H., Fukunaga, T. (eds) Algorithms and Computation. ISAAC 2008. Lecture Notes in Computer Science, vol 5369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92182-0_38
DOI: https://doi.org/10.1007/978-3-540-92182-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92181-3
Online ISBN: 978-3-540-92182-0
eBook Packages: Computer Science, Computer Science (R0)