A Sequential Recursive Implementation of Dead-Zone Single Keyword Pattern Matching

  • Conference paper
Combinatorial Algorithms (IWOCA 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7643)

Abstract

Earlier publications [18] provided an abstract specification of a family of single keyword pattern matching algorithms that search unexamined portions of the text in a divide-and-conquer fashion, generating dead zones in the text as they progress. These dead zones are areas of text that require no further examination. Here we describe the results of implementing a sequential recursive version of the algorithm family in C++, where all instances of a single keyword p in a text S are sought. This is the online keyword matching problem, in which S may not be preprocessed.
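The divide-and-conquer scheme described above can be illustrated with a minimal C++ sketch. It is not the authors' implementation: the names `ShiftTables`, `build_shift_tables`, `dead_zone_search` and `find_all` are hypothetical, and the two shift tables (a Horspool-style right shift keyed on the last character of the window, and a mirrored left shift keyed on the first) are one plausible choice that is assumed here for illustration. Each probe at the midpoint of a live zone marks positions on both sides as dead, and the recursion continues on whatever live zone remains to the left and to the right.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shift tables (an assumption, not taken from the paper).
struct ShiftTables {
    std::array<std::ptrdiff_t, 256> left;   // keyed on the first character of the window
    std::array<std::ptrdiff_t, 256> right;  // keyed on the last character of the window
};

ShiftTables build_shift_tables(const std::string& p) {
    assert(!p.empty());
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(p.size());
    ShiftTables t;
    t.left.fill(m);
    t.right.fill(m);
    // Right shift (Horspool-style): distance from the rightmost occurrence
    // of a character in p[0..m-2] to the end of p.
    for (std::ptrdiff_t i = 0; i < m - 1; ++i)
        t.right[static_cast<unsigned char>(p[i])] = m - 1 - i;
    // Left shift (mirror image): smallest k >= 1 with p[k] == c.
    for (std::ptrdiff_t k = m - 1; k >= 1; --k)
        t.left[static_cast<unsigned char>(p[k])] = k;
    return t;
}

// Searches the live zone [lo, hi) of candidate start positions.
void dead_zone_search(const std::string& s, const std::string& p,
                      const ShiftTables& t, std::ptrdiff_t lo,
                      std::ptrdiff_t hi, std::vector<std::size_t>& out) {
    if (lo >= hi) return;  // nothing left alive in this zone
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(p.size());
    const std::ptrdiff_t j = lo + (hi - lo) / 2;  // probe the midpoint
    const bool match =
        s.compare(static_cast<std::size_t>(j), p.size(), p) == 0;
    const std::ptrdiff_t shl = t.left[static_cast<unsigned char>(s[j])];
    const std::ptrdiff_t shr = t.right[static_cast<unsigned char>(s[j + m - 1])];
    // Start positions j-shl+1 .. j+shr-1 are now dead; recurse on the rest.
    dead_zone_search(s, p, t, lo, j - shl + 1, out);
    if (match) out.push_back(static_cast<std::size_t>(j));
    dead_zone_search(s, p, t, j + shr, hi, out);
}

// Returns all start positions of p in s, in increasing order.
std::vector<std::size_t> find_all(const std::string& s, const std::string& p) {
    std::vector<std::size_t> out;
    if (p.empty() || p.size() > s.size()) return out;
    const ShiftTables t = build_shift_tables(p);
    const std::ptrdiff_t hi =
        static_cast<std::ptrdiff_t>(s.size() - p.size()) + 1;
    dead_zone_search(s, p, t, 0, hi, out);
    return out;
}
```

With these assumptions, `find_all("abracadabra", "abra")` yields the positions 0 and 7 after only three probes. Recording the match between the two recursive calls keeps the output sorted without a separate sorting pass.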

We show that each step may involve a window shift of up to \(2 \times |p| - 1\) characters, almost twice as much (and therefore potentially almost twice as fast) as the maximum of \(|p|\) characters possible with the Boyer-Moore family of algorithms. This counterintuitive improvement over Boyer-Moore algorithms is achieved by simultaneously shifting left and right. Ongoing benchmarking [12] shows that such bidirectional shifts are highly efficient, and we make specific comparisons here to Horspool’s algorithm [9], regarded as one of the most efficient algorithms of the Boyer-Moore family.
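One way to read the \(2 \times |p| - 1\) bound is the following short calculation. It assumes (the abstract does not spell this out) that a probe at text position \(j\) yields a left shift \(\mathit{shl}\) and a right shift \(\mathit{shr}\), each between 1 and \(|p|\); both symbols are introduced here for illustration only. The probe then rules out the start positions \(j - \mathit{shl} + 1, \ldots, j + \mathit{shr} - 1\), so the dead zone created by a single step has width

\[
\mathit{shl} + \mathit{shr} - 1 \;\le\; |p| + |p| - 1 \;=\; 2 \times |p| - 1 .
\]

For a keyword with \(|p| = 5\), for example, one probe can eliminate at most 9 positions, against at most 5 for a shift in one direction only.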


References

  1. Berry, T., Ravindran, S.: A fast string matching algorithm and experimental results. In: Holub, J., Simánek, M. (eds.) Proceedings of the Prague Stringology Club Workshop 1999, pp. 16–26. No. Collaborative Report DC-99-05, Czech Technical University, Prague, Czech Republic (1999)

  2. Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Communications of the ACM 20(10), 762–772 (1977)

  3. Charras, C., Lecroq, T.: Handbook of exact string matching algorithms. Kings College Publications (2004)

  4. Cleophas, L., Watson, B.W., Zwaan, G.: A new taxonomy of sublinear right-to-left scanning keyword pattern matching algorithms. Science of Computer Programming 75, 1095–1112 (2010)

  5. Cleophas, L.G., Watson, B.W.: Taxonomy-Based Software Construction of SPARE Time: A case study. IEE Proceedings — Software 152(1), 29–37 (2005)

  6. Crochemore, M.A., Rytter, W.: Text Algorithms. Oxford University Press (1994)

  7. Crochemore, M.A., Rytter, W.: Jewels of Stringology. World Scientific Publishing Company (2003)

  8. Faro, S., Lecroq, T.: 2001–2010: Ten years of exact string matching algorithms. In: Holub, J., Žďárek, J. (eds.) Proceedings of the Prague Stringology Conference 2011, pp. 1–2. Czech Technical University in Prague, Czech Republic (2011)

  9. Horspool, R.N.: Practical fast searching in strings. Software — Practice & Experience 10(6), 501–506 (1980)

  10. Knuth, D.E., Morris, J., Pratt, V.R.: Fast pattern matching in strings. SIAM Journal on Computing 6(2), 323–350 (1977)

  11. Kourie, D.G., Watson, B.W.: The Correctness-by-Construction Approach to Programming. Springer (2012)

  12. Mauch, M., Watson, B.W., Kourie, D.G., Strauss, T.: Performance assessment of dead-zone single keyword pattern matching. In: Kroeze, J. (ed.) Proceedings of the South African Institute of Computer Scientists and Information Technologists Conference, Pretoria, South Africa (October 2012)

  13. Meyer, B.: Object-Oriented Software Construction, 2nd edn. Addison-Wesley (1998)

  14. Smyth, W.F.: Computing Patterns in Strings. Addison-Wesley (2003)

  15. Watson, B.W.: Taxonomies and Toolkits of Regular Language Algorithms. Ph.D. dissertation, Eindhoven University of Technology, Eindhoven, Netherlands (1995)

  16. Watson, B.W., Cleophas, L.: SPARE Parts: A C++ toolkit for String Pattern Recognition. Software — Practice & Experience 34(7), 697–710 (2004)

  17. Watson, B.W., Watson, R.E.: A new family of string pattern matching algorithms. In: Holub, J. (ed.) Proceedings of the Second Prague Stringologic Workshop, pp. 12–23. Czech Technical University, Prague, Czech Republic (July 1997)

  18. Watson, B.W., Watson, R.E.: A new family of string pattern matching algorithms. South African Computer Journal 30, 34–41 (2003). For rapid access, a reprint of this article appears on www.fastar.org; the journal remains the appropriate citation reference.


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Watson, B.W., Kourie, D.G., Strauss, T. (2012). A Sequential Recursive Implementation of Dead-Zone Single Keyword Pattern Matching. In: Arumugam, S., Smyth, W.F. (eds) Combinatorial Algorithms. IWOCA 2012. Lecture Notes in Computer Science, vol 7643. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35926-2_26

  • DOI: https://doi.org/10.1007/978-3-642-35926-2_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35925-5

  • Online ISBN: 978-3-642-35926-2

  • eBook Packages: Computer Science, Computer Science (R0)
