
Deterministic Sparse Column Based Matrix Reconstruction via Greedy Approximation of SVD

  • Ali Çivril
  • Malik Magdon-Ismail
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5369)

Abstract

Given a matrix \(A \in \mathbb{R}^{m \times n}\) of rank r, and an integer k < r, the top k singular vectors provide the best rank-k approximation \(A_k\) to A. When the columns of A have specific meaning, it is desirable to find (provably) “good” approximations to \(A_k\) which use only a small number of columns of A. Proposed solutions to this problem have thus far focused on randomized algorithms. Our main result is a simple greedy deterministic algorithm with guarantees on the performance and the number of columns chosen. Specifically, our greedy algorithm chooses c columns from A with
$$c = O\!\left(\frac{k^2 \log k}{\epsilon^2}\,\mu^2(A)\,\ln\!\left(\frac{\sqrt{k}\,\|A_k\|_F}{\epsilon\,\|A - A_k\|_F}\right)\right)$$
such that

$$\|A - C_{gr} C_{gr}^{+} A\|_F \leq \left(1+\epsilon\right)\|A - A_k\|_F,$$

where \(C_{gr}\) is the matrix composed of the c columns, \(C_{gr}^{+}\) is the pseudo-inverse of \(C_{gr}\) (\(C_{gr}C_{gr}^{+}A\) is the best reconstruction of A from \(C_{gr}\)), and μ(A) is a measure of the coherence in the normalized columns of A. The running time of the algorithm is \(O(\mathrm{SVD}(A_k) + mnc)\), where \(\mathrm{SVD}(A_k)\) is the running time of computing the first k singular vectors of A. To the best of our knowledge, this is the first deterministic algorithm with performance guarantees on the number of columns and a (1 + ε) approximation ratio in the Frobenius norm. The algorithm is quite simple and intuitive, and is obtained by combining a generalization of the well-known sparse approximation problem from information theory with an existence result on the possibility of sparse approximation. Tightening the analysis along either of these two dimensions would yield improved results.
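The reconstruction the abstract bounds, \(C_{gr}C_{gr}^{+}A\), is the orthogonal projection of A onto the span of the chosen columns. The following NumPy sketch illustrates one simple greedy variant of this idea: repeatedly pick the column whose normalized direction best explains the current residual, then re-project A onto the span of all columns chosen so far. This is an illustrative simplification, not the authors' exact algorithm (which scores columns against the top-k singular subspace); the function name and scoring rule here are ours.

```python
import numpy as np

def greedy_column_reconstruction(A, c):
    """Greedily select c columns of A; at each step pick the column whose
    normalized direction carries the most mass of the current residual,
    then set the residual to A - C C^+ A for the chosen columns C.
    Simplified sketch, not the paper's algorithm."""
    m, n = A.shape
    col_norms = np.linalg.norm(A, axis=0)
    selected = []
    R = A.copy()  # residual after projecting onto chosen columns
    for _ in range(c):
        scores = np.full(n, -np.inf)
        for j in range(n):
            if j in selected or col_norms[j] == 0:
                continue
            u = A[:, j] / col_norms[j]          # unit vector along column j
            scores[j] = np.linalg.norm(u @ R)   # residual mass along u
        best = int(np.argmax(scores))
        selected.append(best)
        C = A[:, selected]
        R = A - C @ np.linalg.pinv(C) @ A       # project A onto span(C)
    return selected, R

# Usage: compare against the best rank-k error from the SVD.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10)) @ rng.standard_normal((10, 30))  # rank <= 10
k = 3
U, s, Vt = np.linalg.svd(A, full_matrices=False)
best_rank_k_err = np.linalg.norm(A - (U[:, :k] * s[:k]) @ Vt[:k])
cols, R = greedy_column_reconstruction(A, c=6)
greedy_err = np.linalg.norm(R)  # ||A - C C^+ A||_F for the chosen columns
```

Because the span of the selected columns only grows, the residual norm is non-increasing in c; the paper's contribution is quantifying how many columns suffice to get within a (1 + ε) factor of the best rank-k error.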

Keywords

Greedy Algorithm, Failure Probability, Singular Vector, Frobenius Norm, Deterministic Algorithm



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Ali Çivril
  • Malik Magdon-Ismail
  1. Computer Science Department, RPI, Troy, USA
