Abstract
Earlier publications provided an abstract specification of a family of single keyword pattern matching algorithms [18] which search unexamined portions of the text in a divide-and-conquer fashion, generating dead zones in the text as they progress. These dead zones are areas of text that require no further examination. Here we describe the results of implementing in C++ a sequential recursive version of the algorithm family, in which all instances of a single keyword p in a text S are sought. This is the online keyword matching problem, where S may not be precomputed.
We show that each step may involve a window shift of up to \(2 \times |p| - 1\) characters, almost twice as much (and therefore potentially almost twice as fast) as the maximum of \(|p|\) characters possible with the Boyer-Moore family of algorithms. This counterintuitive improvement over Boyer-Moore algorithms is achieved by simultaneously shifting left and right. Ongoing benchmarking [12] shows that such bidirectional shifts are highly efficient, and we make specific comparisons here to Horspool’s algorithm [9], regarded as one of the most efficient algorithms of the Boyer-Moore family.
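The idea of a recursive dead-zone search with bidirectional shifts can be illustrated with a minimal C++ sketch. It is not the authors' implementation: the names `dead_zone_find`, `dz_search`, `shl` and `shr` are hypothetical, and the two shift tables are an assumed Horspool-style choice (one keyed on the last character of the window, a mirrored one keyed on the first). Each call probes the middle of the live range of candidate start positions, then recurses only on the parts of the range that the left and right shifts leave alive. The positions skipped around the probe form the dead zone, which spans at most \(2 \times |p| - 1\) candidate positions.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Shift tables indexed by character value (assumed 8-bit alphabet).
using ShiftTable = std::array<std::ptrdiff_t, 256>;

// Search candidate start positions in the half-open live range [lo, hi).
// Hypothetical helper: illustrates the technique, not the published code.
void dz_search(const std::string& s, const std::string& p,
               const ShiftTable& shl, const ShiftTable& shr,
               std::ptrdiff_t lo, std::ptrdiff_t hi,
               std::vector<std::size_t>& hits)
{
    if (lo >= hi) return;  // nothing left alive in this range
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(p.size());
    const std::ptrdiff_t mid = lo + (hi - lo) / 2;
    const unsigned char first = static_cast<unsigned char>(s[mid]);
    const unsigned char last = static_cast<unsigned char>(s[mid + m - 1]);

    // Left part: candidates in (mid - shl[first], mid) are dead.
    dz_search(s, p, shl, shr, lo, mid - shl[first] + 1, hits);

    // Probe the window starting at mid (reported in order, between halves).
    if (s.compare(static_cast<std::size_t>(mid), p.size(), p) == 0)
        hits.push_back(static_cast<std::size_t>(mid));

    // Right part: candidates in (mid, mid + shr[last]) are dead.
    dz_search(s, p, shl, shr, mid + shr[last], hi, hits);
}

// Return all start positions of p in s, in increasing order.
std::vector<std::size_t> dead_zone_find(const std::string& s,
                                        const std::string& p)
{
    std::vector<std::size_t> hits;
    if (p.empty() || p.size() > s.size()) return hits;
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(p.size());

    ShiftTable shl, shr;
    shl.fill(m);  // default: shift a full pattern length
    shr.fill(m);
    // Left shift: smallest k >= 1 with p[k] == c (c = first window character).
    for (std::ptrdiff_t i = m - 1; i >= 1; --i)
        shl[static_cast<unsigned char>(p[i])] = i;
    // Right shift (Horspool): distance from the last occurrence of c in
    // p[0..m-2] to the end of the pattern (c = last window character).
    for (std::ptrdiff_t i = 0; i < m - 1; ++i)
        shr[static_cast<unsigned char>(p[i])] = m - 1 - i;

    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(s.size());
    dz_search(s, p, shl, shr, 0, n - m + 1, hits);
    return hits;
}
```

For example, `dead_zone_find("abracadabra", "abra")` probes position 4 first. The first window character `c` does not occur in the keyword and the last window character `a` permits a right shift of 3, so positions 1 to 6 become dead after a single probe, and only positions 0 and 7 remain to be checked. The in-order recursion is a design choice of this sketch that keeps the reported positions sorted.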
References
Berry, T., Ravindran, S.: A fast string matching algorithm and experimental results. In: Holub, J., Simánek, M. (eds.) Proceedings of the Prague Stringology Club Workshop 1999, pp. 16–26. No. Collaborative Report DC-99-05, Czech Technical University, Prague, Czech Republic (1999)
Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Communications of the ACM 20(10), 62–72 (1977)
Charras, C., Lecroq, T.: Handbook of exact string matching algorithms. Kings College Publications (2004)
Cleophas, L., Watson, B.W., Zwaan, G.: A new taxonomy of sublinear right-to-left scanning keyword pattern matching algorithms. Science of Computer Programming 75, 1095–1112 (2010)
Cleophas, L.G., Watson, B.W.: Taxonomy-Based Software Construction of SPARE Time: A case study. IEE Proceedings — Software 152(1), 29–37 (2005)
Crochemore, M.A., Rytter, W.: Text Algorithms. Oxford University Press (1994)
Crochemore, M.A., Rytter, W.: Jewels of Stringology. World Scientific Publishing Company (2003)
Faro, S., Lecroq, T.: 2001–2010: Ten years of exact string matching algorithms. In: Holub, J., Žďárek, J. (eds.) Proceedings of the Prague Stringology Conference 2011, pp. 1–2. Czech Technical University in Prague, Czech Republic (2011)
Horspool, R.N.: Practical fast searching in strings. Software — Practice & Experience 10(6), 501–506 (1980)
Knuth, D.E., Morris, J., Pratt, V.R.: Fast pattern matching in strings. SIAM Journal on Computing 6(2), 323–350 (1977)
Kourie, D.G., Watson, B.W.: The Correctness-by-Construction Approach to Programming. Springer (2012)
Mauch, M., Watson, B.W., Kourie, D.G., Strauss, T.: Performance assessment of dead-zone single keyword pattern matching. In: Kroeze, J. (ed.) Proceedings of the South African Institute of Computer Scientists and Information Technologists Conference, Pretoria, South Africa (October 2012)
Meyer, B.: Object-Oriented Software Construction, 2nd edn. Addison-Wesley (1998)
Smyth, W.F.: Computing Patterns in Strings. Addison-Wesley (2003)
Watson, B.W.: Taxonomies and Toolkits of Regular Language Algorithms. Ph.D. dissertation. Eindhoven University of Technology, Eindhoven, Netherlands (1995)
Watson, B.W., Cleophas, L.: SPARE Parts: A C++ toolkit for String Pattern Recognition. Software — Practice & Experience 34(7), 697–710 (2004)
Watson, B.W., Watson, R.E.: A new family of string pattern matching algorithms. In: Holub, J. (ed.) Proceedings of the Second Prague Stringologic Workshop, pp. 12–23. Czech Technical University, Prague, Czech Republic (July 1997)
Watson, B.W., Watson, R.E.: A new family of string pattern matching algorithms. South African Computer Journal 30, 34–41 (2003). For rapid access, a reprint of this article appears on www.fastar.org. This journal remains the appropriate citation reference.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Watson, B.W., Kourie, D.G., Strauss, T. (2012). A Sequential Recursive Implementation of Dead-Zone Single Keyword Pattern Matching. In: Arumugam, S., Smyth, W.F. (eds) Combinatorial Algorithms. IWOCA 2012. Lecture Notes in Computer Science, vol 7643. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35926-2_26
DOI: https://doi.org/10.1007/978-3-642-35926-2_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35925-5
Online ISBN: 978-3-642-35926-2
eBook Packages: Computer Science (R0)