
Improved Sketching of Hamming Distance with Error Correcting

  • Ely Porat
  • Ohad Lipsky
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4580)

Abstract

We address the problem of sketching the Hamming distance of data streams. We develop Fixable Sketches, which compare data streams or files and restore the differences between them. Our contribution: for two streams with Hamming distance bounded by k, we show a sketch of size O(k log n) with O(log n) processing time per new element in the stream, and we show how to restore all locations where the two streams differ in time linear in the sketch size. The probability of error is less than 1/n.
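To make the idea of a sketch that can both compare two streams and locate their difference concrete, here is a minimal Python illustration for the simplest case, k = 1 on binary streams of equal length. It is not the construction from the paper, which is not described on this page; it is the classic trick of XOR-ing the positions of the 1-bits, and the names `sketch` and `differing_position` are hypothetical. The sketch occupies O(log n) bits and is updated once per stream element.

```python
from typing import Iterable, Optional


def sketch(stream: Iterable[int]) -> int:
    """XOR together the 1-based positions of all 1-bits in the stream.

    Illustrative assumption: the stream is binary, and both parties
    process streams of the same length n.
    """
    acc = 0
    for pos, bit in enumerate(stream, start=1):
        if bit:
            acc ^= pos  # one cheap update per new element
    return acc


def differing_position(sketch_a: int, sketch_b: int) -> Optional[int]:
    """Return the 0-based index where the two streams differ.

    Correct only if the Hamming distance is at most 1. Returns None
    when the sketches agree (distance 0 under that assumption).
    """
    diff = sketch_a ^ sketch_b
    return diff - 1 if diff else None


if __name__ == "__main__":
    a = [1, 0, 1, 1, 0, 0, 1, 0]
    b = list(a)
    b[5] ^= 1  # flip a single position
    print(differing_position(sketch(a), sketch(b)))
```

Positions are 1-based inside the sketch so that a difference at the first element still produces a non-zero value. If the streams differ in two or more places, this toy version silently returns a wrong answer; handling up to k differences with a bounded error probability is the problem the abstract addresses.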




Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Ely Porat (1)
  • Ohad Lipsky (2)
  1. Bar-Ilan University, Dept. of Computer Science, 52900 Ramat-Gan, Israel, and Google Inc.
  2. Bar-Ilan University, Dept. of Computer Science, 52900 Ramat-Gan, Israel
