Abstract
We investigate the problem of deterministic pattern matching in multiple streams. In this model, symbols arrive one at a time, and each is associated with one of s streaming texts. The task at each time step is to report whether there is a new match between a fixed pattern of length m and the newly updated stream. As is usual in the streaming context, the goal is to use as little space as possible while still reporting matches quickly. We give almost matching upper and lower space bounds for three distinct pattern matching problems. For exact matching, we show that the problem can be solved in constant time per arriving symbol and O(m + s) words of space. For the k-mismatch and k-difference problems, we give O(k) time solutions that require O(m + ks) words of space. In all three cases, we also give space lower bounds, which show that our methods are optimal up to a single logarithmic factor. Finally, we set out a number of open problems related to this new model for pattern matching.
This work was partially supported by EPSRC.
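The multi-stream model for exact matching described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses the standard Knuth-Morris-Pratt failure function, shared by all streams, plus one integer of state per stream. That gives O(m + s) words of space, but only amortized (not worst-case) constant time per arriving symbol. The names `failure_table`, `MultiStreamMatcher`, and `feed` are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def failure_table(pattern: str) -> List[int]:
    """KMP failure function: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


class MultiStreamMatcher:
    """Illustrative baseline only (an assumption, not the paper's method)."""

    def __init__(self, pattern: str) -> None:
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        self.fail = failure_table(pattern)  # O(m) words, shared by all streams
        self.state: Dict[int, int] = {}     # one integer per stream: O(s) words

    def feed(self, stream_id: int, symbol: str) -> bool:
        """Append `symbol` to stream `stream_id`; return True if the pattern
        now ends at that stream's newest symbol."""
        k = self.state.get(stream_id, 0)
        # Follow failure links until the new symbol extends a pattern prefix.
        while k > 0 and symbol != self.pattern[k]:
            k = self.fail[k - 1]
        if symbol == self.pattern[k]:
            k += 1
        matched = k == len(self.pattern)
        if matched:
            # Fall back so that overlapping occurrences are still found.
            k = self.fail[k - 1]
        self.state[stream_id] = k
        return matched


if __name__ == "__main__":
    matcher = MultiStreamMatcher("aba")
    # Symbols from streams 0 and 1 arrive interleaved.
    arrivals = [(0, "a"), (1, "a"), (0, "b"), (1, "a"), (0, "a")]
    for stream_id, symbol in arrivals:
        if matcher.feed(stream_id, symbol):
            print(f"match in stream {stream_id}")  # prints once, for stream 0
```

The point of the sketch is that no stream's text is stored: only the current automaton state for each stream is kept, which is why the space grows with s rather than with the total stream length.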
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Clifford, R., Jalsenius, M., Porat, E., Sach, B. (2012). Pattern Matching in Multiple Streams. In: Kärkkäinen, J., Stoye, J. (eds) Combinatorial Pattern Matching. CPM 2012. Lecture Notes in Computer Science, vol 7354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31265-6_8
DOI: https://doi.org/10.1007/978-3-642-31265-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31264-9
Online ISBN: 978-3-642-31265-6
eBook Packages: Computer Science, Computer Science (R0)