
Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods

  • Petros Drineas
  • Michael W. Mahoney
  • S. Muthukrishnan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4168)

Abstract

Much recent work in theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m × n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small, carefully constructed matrix that guarantees that the product CUR is “close” to A. Applications of such decompositions include the computation of matrix “sketches”, speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representations, and improving the interpretability of data analysis methods. Our main result is a randomized, polynomial-time algorithm which, given as input an m × n matrix A, returns as output matrices C, U, and R such that

$$\|{A-CUR}\|_F \leq (1+\epsilon)\|{A-A_k}\|_F$$

with probability at least 1 − δ. Here, A_k is the “best” rank-k approximation to A (obtained by truncating its Singular Value Decomposition), and ||X||_F is the Frobenius norm of the matrix X. The number of columns in C and rows in R is a low-degree polynomial in k, 1/ε, and log(1/δ). Our main result is obtained by extending our recent relative-error approximation algorithm for ℓ2 regression from overconstrained problems to general ℓ2 regression problems. Our algorithm is simple, and it runs in time of the order of the time needed to compute the top k right singular vectors of A. In addition, it samples the columns and rows of A via the method of “subspace sampling,” so named because the sampling probabilities depend on the lengths of the rows of the top singular vectors, and because they ensure that we capture entirely a certain subspace of interest.
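The following is a minimal NumPy sketch of the subspace-sampling idea described above, assuming a leverage-score reading of the sampling probabilities (squared Euclidean norms of the rows of the top-k singular vector matrices, divided by k) and the common choice U = C^+ A R^+ for the middle matrix. The function name subspace_sample_cur, the column and row counts c and r, and this particular construction of U are illustrative assumptions, not the paper's exact algorithm.

```python
# A minimal sketch of a leverage-score ("subspace sampling") CUR approximation.
# The sampling probabilities follow the column/row-sampling framework described
# in the abstract; the exact sample sizes and the construction of U in the
# paper may differ from the choices made here.

import numpy as np

def subspace_sample_cur(A, k, c, r, seed=None):
    """Return C, U, R where C holds c sampled columns and R holds r sampled rows of A.

    Columns are sampled with probabilities proportional to the squared row norms
    of V_k (top-k right singular vectors); rows analogously from U_k.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Top-k thin SVD of A; only the leading k singular vectors are used.
    U_full, _, Vt_full = np.linalg.svd(A, full_matrices=False)
    U_k, V_k = U_full[:, :k], Vt_full[:k, :].T        # U_k: m x k, V_k: n x k

    # "Subspace sampling" probabilities: normalized squared row norms.
    p_cols = (V_k ** 2).sum(axis=1) / k                # length n, sums to 1
    p_rows = (U_k ** 2).sum(axis=1) / k                # length m, sums to 1

    # Sample (with replacement) c columns and r rows of A.
    col_idx = rng.choice(n, size=c, p=p_cols)
    row_idx = rng.choice(m, size=r, p=p_rows)
    C = A[:, col_idx]
    R = A[row_idx, :]

    # One standard choice of the small middle matrix: U = C^+ A R^+,
    # so that CUR projects A onto the span of C's columns and R's rows.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 50)) @ rng.standard_normal((50, 150))
    k = 10
    C, U, R = subspace_sample_cur(A, k, c=4 * k, r=4 * k, seed=1)

    # Compare against the best rank-k approximation A_k from the truncated SVD.
    U_full, s, Vt_full = np.linalg.svd(A, full_matrices=False)
    A_k = (U_full[:, :k] * s[:k]) @ Vt_full[:k, :]
    print("||A - CUR||_F =", np.linalg.norm(A - C @ U @ R))
    print("||A - A_k||_F =", np.linalg.norm(A - A_k))
```

In the paper's analysis the sample sizes c and r would be set to a low-degree polynomial in k, 1/ε, and log(1/δ); the factor 4k used above is an arbitrary placeholder for experimentation.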

Keywords

Singular Value Decomposition · Regression Problem · Matrix Approximation · Matrix Decomposition · Theoretical Computer Science

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Petros Drineas (1)
  • Michael W. Mahoney (2)
  • S. Muthukrishnan (3)
  1. Department of Computer Science, RPI
  2. Yahoo Research Labs
  3. Department of Computer Science, Rutgers University
