Abstract
The Boyer and Moore (BM) pattern matching algorithm is considered one of the best, but its performance degrades on binary data. Yet searching in binary texts has important applications, such as compressed matching. This paper shows how, by means of some pre-computed tables, the BM algorithm can also be implemented for the binary case without referring to individual bits, processing only entire blocks such as bytes or words, thereby significantly reducing the number of comparisons. Empirical comparisons show that the new variant performs better than regular binary BM and even better than BDM.
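To make the idea of block-wise comparison concrete, the following is a minimal sketch and not the algorithm of the paper: it omits the Boyer-Moore shift heuristics entirely and only illustrates how pre-computed tables can let a bit pattern be located at any bit offset while comparing whole bytes. The function names `build_shift_tables` and `find_bit_pattern`, the representation of the pattern as a string of '0' and '1' characters, and the most-significant-bit-first ordering are all assumptions made for this illustration.

```python
def build_shift_tables(pattern_bits: str):
    """For each bit offset 0..7, pre-compute a byte-aligned (pattern, mask) pair.

    Hypothetical helper: the mask marks which bits of each byte belong to the
    pattern, so a comparison never has to look at individual bits.
    """
    if not pattern_bits or set(pattern_bits) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of 0s and 1s")
    m = len(pattern_bits)
    tables = []
    for shift in range(8):
        nbytes = (shift + m + 7) // 8          # bytes spanned at this offset
        tail = nbytes * 8 - shift - m          # unused bits after the pattern
        padded_pat = "0" * shift + pattern_bits + "0" * tail
        padded_mask = "0" * shift + "1" * m + "0" * tail
        pat = int(padded_pat, 2).to_bytes(nbytes, "big")
        mask = int(padded_mask, 2).to_bytes(nbytes, "big")
        tables.append((pat, mask))
    return tables


def find_bit_pattern(text: bytes, pattern_bits: str):
    """Return the bit offsets of all occurrences, using byte comparisons only.

    This is a naive scan over byte positions (no BM-style skipping).
    """
    tables = build_shift_tables(pattern_bits)
    hits = []
    for i in range(len(text)):
        for shift, (pat, mask) in enumerate(tables):
            if i + len(pat) > len(text):
                continue  # pattern would run past the end of the text
            if all((text[i + j] & mask[j]) == pat[j] for j in range(len(pat))):
                hits.append(8 * i + shift)
    return hits
```

For instance, `find_bit_pattern(bytes([0x16, 0xC0]), "1011")` scans the 16-bit text 0001011011000000 and reports the two occurrences starting at bit offsets 3 and 6, the second of which straddles the byte boundary. A BM-style variant would additionally use tables to skip byte positions, which this sketch does not attempt.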
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klein, S.T., Kopel Ben-Nissan, M. (2007). Accelerating Boyer Moore Searches on Binary Texts. In: Holub, J., Žďárek, J. (eds) Implementation and Application of Automata. CIAA 2007. Lecture Notes in Computer Science, vol 4783. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76336-9_14
DOI: https://doi.org/10.1007/978-3-540-76336-9_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76335-2
Online ISBN: 978-3-540-76336-9
eBook Packages: Computer Science, Computer Science (R0)