Algorithmica

pp 1–28

Streaming Pattern Matching with d Wildcards

  • Shay Golan
  • Tsvi Kopelowitz
  • Ely Porat
Article

Abstract

In the pattern matching with d wildcards problem, one is given a text T of length n and a pattern P of length m that contains d wildcard characters, each denoted by the special symbol ‘?’. A wildcard character matches any character. The goal is to determine, for each m-length substring of T, whether it matches P. In the streaming variant of the problem, the text T arrives one character at a time, and the goal is to report, before the next character arrives, whether the last m characters match P, while using only o(m) words of space. In this paper we introduce two new algorithms for the d wildcard pattern matching problem in the streaming model. The first is a randomized Monte Carlo algorithm that is parameterized by a constant \(0\le \delta \le 1\); it uses \(\tilde{O}(d^{1-\delta })\) amortized time per character and \(\tilde{O}(d^{1+\delta })\) words of space. The second algorithm, which is used as a black box in the first algorithm, is a randomized Monte Carlo algorithm that uses \(O(d+\log m)\) worst-case time per character and \(O(d\log m)\) words of space.
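
To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not either of the algorithms introduced in the paper. It keeps the last m characters in a buffer, so it uses O(m) space and O(m) time per character, which is exactly the cost the streaming algorithms are designed to avoid. The names `WILDCARD`, `window_matches`, and `stream_matches` are illustrative assumptions and do not come from the paper.

```python
from collections import deque
from typing import Iterable, Iterator

WILDCARD = "?"  # the special wildcard symbol from the problem statement


def window_matches(window: Iterable[str], pattern: str) -> bool:
    """Return True if every non-wildcard pattern character equals the
    corresponding window character (window and pattern have equal length)."""
    return all(p == WILDCARD or p == c for c, p in zip(window, pattern))


def stream_matches(text: Iterable[str], pattern: str) -> Iterator[int]:
    """Consume the text one character at a time and yield the index of the
    last character of every m-length substring that matches the pattern.

    Naive baseline: stores the last m characters, so it needs O(m) space,
    unlike the o(m)-space algorithms described in the abstract.
    """
    m = len(pattern)
    window = deque(maxlen=m)  # sliding buffer of the most recent m characters
    for i, ch in enumerate(text):
        window.append(ch)
        # Report before the next character is read, as the model requires.
        if len(window) == m and window_matches(window, pattern):
            yield i


if __name__ == "__main__":
    # Pattern "ab?" has d = 1 wildcard; matches end at positions 2 and 5.
    print(list(stream_matches("abcabdab", "ab?")))
```

Each reported index is the position of the most recent character, so a match is announced as soon as its last character arrives.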

Keywords

  • Pattern matching
  • Streaming algorithms
  • Fingerprints
  • String combinatorics

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Bar Ilan University, Ramat Gan, Israel