Speeding up two string-matching algorithms

  • Conference paper
STACS 92 (STACS 1992)

Abstract

We show how to speed up two string-matching algorithms: the Boyer-Moore algorithm (the BM algorithm) and a variant of it that we call the reversed-factor algorithm (the RF algorithm). The RF algorithm is based on factor graphs for the reverse of the pattern. Both algorithms scan the text right-to-left from the supposed right position of the pattern: the BM algorithm continues as long as the scanned segment is a suffix of the pattern, while the RF algorithm continues as long as it is a factor of the pattern. They then shift the pattern, forget the history, and start again. The RF algorithm usually makes bigger shifts than BM, but is quadratic in the worst case. We show that remembering the last matched segment is enough to speed up the RF algorithm considerably (it then makes a linear number of comparisons with a small coefficient) and to speed up the BM algorithm with match-shifts (it then makes at most 2n comparisons). Only constant additional memory is needed for the search phase. We give alternative versions of the accelerated RF algorithm: the first is based on combinatorial properties of primitive words, and the two others make extensive use of suffix trees.
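The basic RF scan described in the abstract (before any speed-up) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function name `rf_search` is hypothetical, and a plain set of all factors of the pattern stands in for the factor graph of the reversed pattern, which costs quadratic preprocessing space and is only meant to make the scanning and shifting logic visible.

```python
def rf_search(pattern, text):
    """Return the start positions of all occurrences of pattern in text.

    Unaccelerated reversed-factor scan: each window is read right-to-left
    while the scanned segment is still a factor of the pattern.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # ASSUMPTION: naive stand-in for the factor graph (all non-empty factors).
    factors = {pattern[a:b] for a in range(m) for b in range(a + 1, m + 1)}
    # Non-empty proper prefixes of the pattern, used to compute safe shifts.
    prefixes = {pattern[:k] for k in range(1, m)}

    matches = []
    i = 0  # left end of the current window text[i:i+m]
    while i <= n - m:
        shift = m          # default shift when no prefix of the pattern is seen
        j = m - 1          # scan position inside the window, right to left
        while j >= 0 and text[i + j:i + m] in factors:
            # The scanned segment is a proper prefix of the pattern:
            # an occurrence could start at window offset j, so shift there.
            if j > 0 and text[i + j:i + m] in prefixes:
                shift = j
            j -= 1
        if j < 0:
            # The whole window is a factor of length m, i.e. the pattern itself.
            matches.append(i)
        i += shift         # the history of this scan is forgotten here
    return matches


print(rf_search("abc", "xxabcxabc"))
```

Because each window is rescanned from scratch after the shift, this version can take quadratic time on periodic inputs, which is the behaviour the paper's accelerated versions are designed to avoid.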

Work by these authors is partially supported by PRC “Mathématiques-Informatique”.

Work by this author is partially supported by NATO Grant CRG 900293

Editor information

Alain Finkel, Matthias Jantzen

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crochemore, M. et al. (1992). Speeding up two string-matching algorithms. In: Finkel, A., Jantzen, M. (eds) STACS 92. STACS 1992. Lecture Notes in Computer Science, vol 577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-55210-3_215

  • DOI: https://doi.org/10.1007/3-540-55210-3_215

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-55210-9

  • Online ISBN: 978-3-540-46775-5

  • eBook Packages: Springer Book Archive
