
Abstract

We prove that any real matrix A contains a subset of at most 4k/ε + 2k log(k+1) rows whose span “contains” a matrix of rank at most k with error only (1+ε) times the error of the best rank-k approximation of A. We complement this with an almost matching lower bound by constructing matrices where the span of any k/(2ε) rows does not “contain” a relative (1+ε)-approximation of rank k. Our existence result leads to an algorithm that finds such a rank-k approximation in time

\( O \left( M \left( \frac{k}{\epsilon} + k^{2} \log k \right) + (m+n) \left( \frac{k^{2}}{\epsilon^{2}} + \frac{k^{3} \log k}{\epsilon} + k^{4} \log^{2} k \right) \right), \)

i.e., essentially O(Mk/ε), where M is the number of nonzero entries of A. The algorithm maintains sparsity, and in the streaming model [12,14,15], it can be implemented using only 2(k+1)(log(k+1)+1) passes over the input matrix and \(O \left( \min \{ m, n \} (\frac{k}{\epsilon} + k^{2} \log k) \right)\) additional space. Previous algorithms for low-rank approximation use only one or two passes but obtain an additive approximation.
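To get a feel for the sizes involved, the bounds stated above can be evaluated numerically. The following is only a small sketch: the function names are hypothetical, and the base of the logarithm is not specified in the abstract, so it is exposed as a parameter (`log_base`, defaulting to 2 as an assumption).

```python
import math


def row_subset_upper_bound(k, eps, log_base=2.0):
    """Upper bound 4k/eps + 2k*log(k+1) on the number of rows.

    The logarithm base is an assumption; the abstract does not state it.
    """
    return 4 * k / eps + 2 * k * math.log(k + 1, log_base)


def row_subset_lower_bound(k, eps):
    """Lower bound k/(2*eps): spans of this many rows can fail to suffice."""
    return k / (2 * eps)


def streaming_pass_bound(k, log_base=2.0):
    """Pass count 2(k+1)(log(k+1)+1) quoted for the streaming model.

    Same caveat as above about the logarithm base.
    """
    return 2 * (k + 1) * (math.log(k + 1, log_base) + 1)


if __name__ == "__main__":
    k, eps = 3, 0.5
    print("rows (upper bound):", row_subset_upper_bound(k, eps))
    print("rows (lower bound):", row_subset_lower_bound(k, eps))
    print("passes:", streaming_pass_bound(k))
```

For fixed k, both row bounds grow like k/ε as ε shrinks, which is the sense in which the lower bound is almost matching.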


References

  1. Arora, S., Hazan, E., Kale, S.: A Fast Random Sampling Algorithm for Sparsifying Matrices. In: Díaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds.) APPROX 2006 and RANDOM 2006. LNCS, vol. 4110, pp. 272–279. Springer, Heidelberg (2006)
  2. Achlioptas, D., McSherry, F.: Fast Computation of Low Rank Approximations. In: Proceedings of the 33rd Annual Symposium on Theory of Computing (2001)
  3. Aggarwal, C., Procopiuc, C., Wolf, J., Yu, P., Park, J.: Fast Algorithms for Projected Clustering. In: Proceedings of SIGMOD (1999)
  4. Bar-Yossef, Z.: Sampling Lower Bounds via Information Theory. In: Proceedings of the 35th Annual Symposium on Theory of Computing (2003)
  5. de la Vega, W.F., Karpinski, M., Kenyon, C., Rabani, Y.: Approximation schemes for clustering problems. In: Proceedings of the 35th Annual ACM Symposium on Theory of Computing (2003)
  6. Drineas, P.: Personal communication (2006)
  7. Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering in large graphs and matrices. In: Proceedings of the 10th SODA (1999)
  8. Drineas, P., Kannan, R.: Pass Efficient Algorithm for approximating large matrices. In: Proceedings of 14th SODA (2003)
  9. Drineas, P., Kannan, R., Mahoney, M.: Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix. Yale University Technical Report, YALEU/DCS/TR-1270 (2004)
  10. Drineas, P., Mahoney, M., Muthukrishnan, S.: Polynomial time algorithm for column-row based relative error low-rank matrix approximation. DIMACS Technical Report 2006-04 (2006)
  11. Deshpande, A., Rademacher, L., Vempala, S., Wang, G.: Matrix Approximation and Projective Clustering via Volume Sampling. In: Proceedings of the 17th ACM-SIAM Symposium on Discrete Algorithms (SODA) (2006)
  12. Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On Graph Problems in a Semi-Streaming Model. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142. Springer, Heidelberg (2004)
  13. Frieze, A., Kannan, R., Vempala, S.: Fast Monte-Carlo algorithms for finding low-rank approximations. Journal of the ACM 51(6), 1025–1041 (2004)
  14. Guha, S., Koudas, N., Shim, K.: Data-streams and histograms. In: Proceedings of 33rd ACM Symposium on Theory of Computing (2001)
  15. Henzinger, M., Raghavan, P., Rajagopalan, S.: Computing on Data Streams. Technical Note 1998-011, Digital Systems Research Center, Palo Alto, CA (May 1998)
  16. Matoušek, J.: On approximate geometric k-clustering. Discrete and Computational Geometry, 61–84 (2000)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Amit Deshpande (1)
  • Santosh Vempala (1)

  1. Mathematics Department and CSAIL, MIT
