A Method to Overcome Computer Word Size Limitation in Bit-Parallel Pattern Matching

  • M. Oǧuzhan Külekci
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5369)

Abstract

The performance of pattern matching algorithms based on bit-parallelism degrades when the input pattern length exceeds the computer word size. Although several divide-and-conquer methods have been proposed to overcome that limitation, the resulting schemes are neither very efficient nor easy to implement. This study introduces a new, fast bit-parallel pattern matching algorithm that can search for patterns of any length in a common bit-parallel fashion. The proposed bit-parallel length invariant matcher (BLIM) is compared with the Shift-Or and bit-parallel non-deterministic matching (BNDM) algorithms, along with the standard Boyer-Moore and Sunday’s quick search algorithms, which are known to be very fast in general. Benchmarks have been conducted on natural language, DNA sequence, and binary alphabet random texts. Beyond the length-invariant architecture of the algorithm, experimental results indicate that BLIM is on average 18%, 44%, and 6% faster than BNDM, which is accepted as one of the fastest algorithms of this genre, on natural language, DNA sequence, and binary random texts, respectively.
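
To make the word-size limitation concrete, the following is a minimal sketch of the classic Shift-Or algorithm, one of the baselines named in the abstract. It is not BLIM and is not taken from the paper. The function name `shift_or_search` and the `word_size` parameter are illustrative assumptions; because Python integers are unbounded, the machine word limit is enforced artificially here to mimic what a fixed-width register would impose.

```python
def shift_or_search(text, pattern, word_size=64):
    """Return the start positions of all occurrences of pattern in text.

    Illustrative Shift-Or sketch. word_size is a hypothetical parameter that
    stands in for the width of a machine register.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")
    if m > word_size:
        # This is the limitation the abstract refers to: the whole search
        # state must fit in one word, so longer patterns need another scheme.
        raise ValueError("pattern length exceeds the word size")

    all_ones = (1 << m) - 1

    # masks[c] has bit i cleared exactly when pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, all_ones) & ~(1 << i)

    state = all_ones            # a 0 in bit i means pattern[0..i] matches here
    accept = 1 << (m - 1)       # bit that signals a full match when it is 0
    hits = []
    for pos, ch in enumerate(text):
        # Shift in a 0 (a new potential match start), then OR in the mask
        # so that mismatching prefixes are turned back into 1 bits.
        state = ((state << 1) | masks.get(ch, all_ones)) & all_ones
        if (state & accept) == 0:
            hits.append(pos - m + 1)
    return hits
```

For example, `shift_or_search("abracadabra", "abra")` reports matches at positions 0 and 7, while a 65-character pattern with the default `word_size` is rejected, which is the case that divide-and-conquer extensions and length-invariant designs aim to handle.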

Keywords

Input Pattern; Text Character; Pattern Length; Current Window; Natural Language Text
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • M. Oǧuzhan Külekci (1)
  1. TÜBİTAK-UEKAE, National Research Institute of Electronics & Cryptology, Gebze, Turkey
