On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation

  • Jelani Nelson
  • Huy L. Nguyễn
  • David P. Woodruff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7408)

Abstract

We study classic streaming and sparse recovery problems using deterministic linear sketches, including $\ell_1/\ell_1$ and $\ell_\infty/\ell_1$ sparse recovery problems, norm estimation, and approximate inner product. We focus on devising a fixed matrix $A \in \mathbb{R}^{m \times n}$ and a deterministic recovery/estimation procedure which work for all possible input vectors simultaneously. We contribute several improved bounds for these problems.
  • A proof that $\ell_\infty/\ell_1$ sparse recovery and inner product estimation are equivalent, and that incoherent matrices can be used to solve both problems. Our upper bound for the number of measurements is $m = O(\varepsilon^{-2} \min\{\log n, (\log n / \log(1/\varepsilon))^2\})$. We can also obtain fast sketching and recovery algorithms by making use of the Fast Johnson-Lindenstrauss transform. Both our running times and number of measurements improve upon previous work. We can also obtain better error guarantees than previous work in terms of a smaller tail of the input vector.

  • A new lower bound for the number of linear measurements required to solve $\ell_1/\ell_1$ sparse recovery. We show $\Omega(k/\varepsilon^2 + k\log(n/k)/\varepsilon)$ measurements are required to recover an $x'$ with $\|x - x'\|_1 \le (1+\varepsilon)\,\|x_{\mathrm{tail}(k)}\|_1$, where $x_{\mathrm{tail}(k)}$ is $x$ projected onto all but its largest $k$ coordinates in magnitude.

  • A tight bound of $m = \Theta(\varepsilon^{-2}\log(\varepsilon^2 n))$ on the number of measurements required to solve deterministic norm estimation, i.e., to recover $\|x\|_2 \pm \varepsilon\|x\|_1$.
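
To make the first item above concrete, the following is a minimal sketch of how an incoherent matrix supports point queries and inner product estimation. It is an illustration under stated assumptions, not the construction or recovery procedure from the paper. The matrix is a classical Reed-Solomon-code-based one chosen only because it is simple and deterministic; the function names (`reed_solomon_columns`, `sketch`, `point_query`, `coherence`) and the parameters `q` and `d` are hypothetical. If every column of $A$ has unit norm and distinct columns have inner product at most $\varepsilon$ in magnitude, then $\langle A_i, Ax\rangle = x_i \pm \varepsilon\|x\|_1$ and $\langle Ax, Ay\rangle = \langle x, y\rangle \pm \varepsilon\|x\|_1\|y\|_1$. The code demonstrates only these basic bounds, not the sharper tail guarantee mentioned in the abstract.

```python
from itertools import product


def reed_solomon_columns(q, d):
    """Return the n = q**d columns of an m x n matrix with m = q*q rows.

    Assumption: q is prime. Each column corresponds to a polynomial p of
    degree < d over Z_q and consists of q blocks of length q; block t holds
    a single entry 1/sqrt(q) at position p(t). Two distinct polynomials
    agree on at most d-1 points, so distinct columns have inner product
    at most (d-1)/q, and every column has unit Euclidean norm.
    """
    scale = q ** -0.5
    columns = []
    for coeffs in product(range(q), repeat=d):
        col = [0.0] * (q * q)
        for t in range(q):
            value = sum(c * pow(t, e, q) for e, c in enumerate(coeffs)) % q
            col[t * q + value] = scale
        columns.append(col)
    return columns


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def coherence(columns):
    """Largest |<A_i, A_j>| over distinct columns i != j."""
    n = len(columns)
    return max(
        abs(dot(columns[i], columns[j]))
        for i in range(n)
        for j in range(i + 1, n)
    )


def sketch(columns, x):
    """Compute the linear sketch y = A x."""
    y = [0.0] * len(columns[0])
    for xi, col in zip(x, columns):
        if xi != 0.0:
            for r, a in enumerate(col):
                y[r] += xi * a
    return y


def point_query(columns, y, i):
    """Estimate x_i from the sketch y = A x as <A_i, y>."""
    return dot(columns[i], y)


if __name__ == "__main__":
    cols = reed_solomon_columns(7, 3)      # n = 343 columns, m = 49 rows
    x = [0.0] * len(cols)
    x[5], x[100] = 10.0, -4.0
    y = sketch(cols, x)
    print("coherence:", coherence(cols))
    print("estimate of x[5]:", point_query(cols, y, 5))
```

The same sketch `y` answers every point query and, paired with a sketch of a second vector, every inner product query, which is the "for all inputs simultaneously" property the abstract emphasizes. This toy construction makes no attempt to match the measurement bound stated above.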

For all the problems we study, tight bounds are already known for the randomized complexity from previous work, except in the case of $\ell_1/\ell_1$ sparse recovery, where a nearly tight bound is known. Our work thus aims to study the deterministic complexities of these problems.

Keywords

Point Query, Full Version, Recovery Procedure, Residue Number System, Recovery Algorithm
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jelani Nelson (1)
  • Huy L. Nguyễn (1)
  • David P. Woodruff (2)
  1. Princeton University, USA
  2. IBM Almaden Research Center, San Jose, USA
