Worst Case Efficient Single and Multiple String Matching in the RAM Model

Belazzougui, Djamal

doi:10.1007/978-3-642-19222-7_10

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6460)

Included in the following conference series:

International Workshop on Combinatorial Algorithms

Abstract

In this paper, we explore worst-case solutions for the problems of pattern and multi-pattern matching on strings in the RAM model with word length w. In the first problem, we have a pattern p of length m over an alphabet of size σ, and given any text T of length n, where each character is encoded using log σ bits, we wish to find all occurrences of p. In the multi-pattern matching problem, we have a set S of d patterns of total length m, and a query on a text T consists of finding all the occurrences in T of the patterns in S (in the following, occ denotes the number of reported occurrences). As each character of the text is encoded using log σ bits and we can read w bits in constant time in the RAM model, the best query time for the two problems, which can only be achieved by reading Θ(w/log σ) consecutive characters at a time, is \(O(n\frac{\log\sigma}{w}+occ)\). We present two results. The first is that, using O(m) words of space, single pattern matching queries can be answered in time \(O(n(\frac{\log m}{m}+\frac{\log \sigma}{w})+occ)\), and multiple pattern matching queries can be answered in time \(O(n(\frac{\log d+\log y+\log\log m}{y}+\frac{\log \sigma}{w})+occ)\), where y is the length of the shortest pattern. Our second result is a variant of the first that uses the four Russians technique to remove the dependence on the shortest pattern length at the expense of additional space t. It answers multi-pattern matching queries in time \(O(n\frac{\log d+\log\log_\sigma t+\log\log m}{\log_\sigma t}+occ)\) using O(m + t) words of space.
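
The \(n\frac{\log\sigma}{w}\) term in the bounds above comes from packing several characters into each machine word. The following is a minimal sketch of that packing assumption only, not of the paper's matching algorithm. The function names, the choice w = 64, and the simplification that characters never straddle a word boundary are all assumptions made for illustration.

```python
import math

def bits_per_char(sigma: int) -> int:
    """Bits needed to encode one character of an alphabet of size sigma."""
    return max(1, (sigma - 1).bit_length())

def pack_text(codes, sigma, w=64):
    """Pack integer character codes (each in [0, sigma)) into w-bit words.

    Simplification (assumption): a character never straddles two words,
    so each word holds floor(w / bits_per_char(sigma)) characters.
    """
    b = bits_per_char(sigma)
    per_word = w // b
    words = []
    for i in range(0, len(codes), per_word):
        word = 0
        for j, c in enumerate(codes[i:i + per_word]):
            word |= c << (j * b)  # character j occupies bits [j*b, (j+1)*b)
        words.append(word)
    return words

def word_reads(n, sigma, w=64):
    """Number of w-bit words that must be read to scan n packed characters."""
    per_word = w // bits_per_char(sigma)
    return math.ceil(n / per_word)

# DNA-like alphabet: sigma = 4, so 2 bits per character, 32 characters per word.
dna_codes = [i % 4 for i in range(1000)]
packed = pack_text(dna_codes, sigma=4)
print(len(packed), word_reads(1000, sigma=4))
```

Under these assumptions, scanning a text of n = 1000 characters over a 4-letter alphabet touches about n log σ / w words rather than n individual characters, which is the quantity the additive \(\frac{\log\sigma}{w}\) term accounts for.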

Author information

Authors and Affiliations

LIAFA, Univ. Paris Diderot-Paris 7, 75205, Paris, Cedex 13, France
Djamal Belazzougui

Editor information

Editors and Affiliations

Department of Computer Science, University of London, King’s College, The Strand, WC2R 2LS, London, UK
Costas S. Iliopoulos
Department of Computing and Software, McMaster University, 1280 Main Street West, L8S 4K1, Hamilton, ON, Canada
William F. Smyth

About this paper

Cite this paper

Belazzougui, D. (2011). Worst Case Efficient Single and Multiple String Matching in the RAM Model. In: Iliopoulos, C.S., Smyth, W.F. (eds) Combinatorial Algorithms. IWOCA 2010. Lecture Notes in Computer Science, vol 6460. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19222-7_10

DOI: https://doi.org/10.1007/978-3-642-19222-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19221-0
Online ISBN: 978-3-642-19222-7
eBook Packages: Computer Science, Computer Science (R0)
