Skip to main content

Exact and approximate pattern matching

  • Chapter
  • 1612 Accesses

A fundamental operation with strings is determining whether a pattern of characters or symbols occurs as a substring in a larger string called the text, or as an approximate subsequence in the text. This problem has been investigated since the early 1960s, not only for its theoretical importance in computer science but because it has many applications in information processing and biological sciences. In computer science, string pattern matching algorithms are used in database search and retrieval, text processing and editing, lexical analysis of computer programs, data compression, cryptography and other applications. In recent years, string matching algorithms have been used as powerful tools in the study of genomics and proteomics, in finding genes and regulatory motifs, and in comparative genomics, gene expression analysis and molecular evolutionary theory.

In this chapter we will begin in Section 7.1 by looking at classic exact pattern matching algorithms for uncompressed text, as some of the compresseddomain methods build on these. The section also looks at algorithms for pattern matching with “don't-care” characters, which allow for uncertainty in the search. We then discuss the compressed domain pattern matching problem in Section 7.2, with the main emphasis on the use of the Burrows-Wheeler Transform to aid compressed-domain pattern search. We will then move on to approximate pattern matching algorithms (Section 7.4), including “k-mismatch algorithms” which allow a fixed number of characters to be different between a pattern and the text it matches. Some uncompressed-domain approximate matching methods are introduced, and then implementations of BWT domain approximate pattern matching are described. It is also possible to design hardware algorithms to accelerate pattern matching, and these will be discussed briefly in Section 7.5.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD   109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

(2008). Exact and approximate pattern matching. In: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78909-5_7

Download citation

  • DOI: https://doi.org/10.1007/978-0-387-78909-5_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-78908-8

  • Online ISBN: 978-0-387-78909-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics