A Method to Overcome Computer Word Size Limitation in Bit-Parallel Pattern Matching
The performance of pattern matching algorithms based on bit-parallelism degrades when the input pattern length exceeds the computer word size. Although several divide-and-conquer methods have been proposed to overcome that limitation, the resulting schemes are neither very efficient nor easy to implement. This study introduces a new fast bit-parallel pattern matching algorithm that is capable of searching patterns of any length in a common bit-parallel fashion. The proposed bit-parallel length invariant matcher (BLIM) is compared with the Shift-Or and bit-parallel non-deterministic matching (BNDM) algorithms, along with the standard Boyer-Moore and Sunday's quick search, which are known to be very fast in general. Benchmarks have been conducted on natural language, DNA sequence, and binary alphabet random texts. Beyond the length-invariant architecture of the algorithm, experimental results indicate that BLIM is on average 18%, 44%, and 6% faster than BNDM, which is accepted as one of the fastest algorithms of this genre, on natural language, DNA sequence, and binary random texts, respectively.
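To make the word size limitation concrete, the following is a minimal sketch of the classic Shift-Or algorithm, one of the baselines named above. It is not BLIM, whose details are not given in this abstract. The function name `shift_or_search`, the constant `WORD_SIZE`, and the choice of 64 bits are assumptions made for illustration. Python integers are unbounded, so the sketch masks the state to `word_size` bits to imitate a machine register and rejects patterns that would not fit.

```python
WORD_SIZE = 64  # assumed machine word width in bits (illustrative)


def shift_or_search(text: str, pattern: str, word_size: int = WORD_SIZE) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Classic Shift-Or: one bit per pattern position, so the whole search
    state must fit in a single word of word_size bits.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    if m > word_size:
        # This is the limitation the abstract refers to: the plain
        # algorithm has no room for a state longer than one word.
        raise ValueError("pattern longer than the machine word")

    full = (1 << word_size) - 1  # a word with every bit set

    # masks[c] has bit i cleared exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, full) & ~(1 << i)

    state = full            # all ones: no prefix matched yet
    accept = 1 << (m - 1)   # bit that is 0 when the full pattern matched
    hits = []
    for pos, ch in enumerate(text):
        # Shift in a 0 at bit 0, then OR with the character mask;
        # characters absent from the pattern reset the state to all ones.
        state = ((state << 1) | masks.get(ch, full)) & full
        if state & accept == 0:
            hits.append(pos - m + 1)
    return hits
```

Calling `shift_or_search("abracadabra", "abra")` reports the two occurrences at indices 0 and 7. A pattern of 65 characters raises `ValueError` under the assumed 64-bit word, which is the case that divide-and-conquer schemes, and the length-invariant approach proposed here, are meant to handle.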
Keywords: Input Pattern, Text Character, Pattern Length, Current Window, Natural Language Text