Approximate String Matching

  • Living reference work entry
Encyclopedia of Algorithms

Years and Authors of Summarized Original Work

1980; Sellers

1989; Landau, Vishkin

1999; Myers

2003; Crochemore, Landau, Ziv-Ukelson

2004; Fredriksson, Navarro

Problem Definition

Given a text string \(T = t_{1}t_{2}\ldots t_{n}\) and a pattern string \(P = p_{1}p_{2}\ldots p_{m}\), both sequences over an alphabet \(\Sigma\) of size σ, a distance function d between strings, and a threshold k, the approximate string matching (ASM) problem is to find all the text positions at which an approximate occurrence of P in T ends, that is, to compute the set \(\{j : \exists i,\ 1 \leq i \leq j,\ d(P,t_{i}\ldots t_{j}) \leq k\}\). In the sequential version of the problem, T, P, and k are given together, whereas the algorithm can be tailored for a specific d.
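
To make the set in the definition concrete, the following is a minimal brute-force sketch that evaluates it literally for an arbitrary distance d. The function names `occurrence_ends` and `hamming` are hypothetical, and the Hamming-style distance is only an illustrative stand-in for d, not the distance this entry focuses on. The sketch is meant to mirror the definition, not to be efficient.

```python
from typing import Callable, List


def hamming(a: str, b: str) -> float:
    """Illustrative distance (an assumption, not from the entry):
    number of mismatches for equal-length strings, infinity otherwise."""
    if len(a) != len(b):
        return float("inf")
    return sum(x != y for x, y in zip(a, b))


def occurrence_ends(text: str, pattern: str, k: int,
                    d: Callable[[str, str], float]) -> List[int]:
    """Return every 1-based position j such that, for some i with
    1 <= i <= j, d(pattern, t_i ... t_j) <= k."""
    ends = []
    for j in range(1, len(text) + 1):
        # text[i - 1:j] is t_i ... t_j in the 1-based notation of the definition.
        if any(d(pattern, text[i - 1:j]) <= k for i in range(1, j + 1)):
            ends.append(j)
    return ends


print(occurrence_ends("abcabd", "abd", 1, hamming))
```

With this stand-in distance, only substrings of length m can match, so the reported positions are the ends of the windows that differ from P in at most k characters.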

The solutions to the problem vary widely depending on the distance d used. This entry focuses on a very popular one, called Levenshtein distance or edit distance, defined as the minimum number of character insertions, deletions,...
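
For the edit distance case, a classical way to compute the set of end positions is column-wise dynamic programming, in the spirit of the approach associated with Sellers (1980). The sketch below assumes the standard unit-cost Levenshtein distance (insertions, deletions, and substitutions each cost 1); the function name `sellers_search` is hypothetical. It keeps one column of m + 1 values and uses O(mn) time.

```python
from typing import List


def sellers_search(text: str, pattern: str, k: int) -> List[int]:
    """Return the 1-based text positions j where some substring ending at j
    is within unit-cost edit distance k of the pattern (assumed definition)."""
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any suffix of
    # the text read so far. Before reading any text, that suffix is empty.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # value of col[i - 1] from the previous column
        # col[0] stays 0, so an occurrence may start at any text position.
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # text character c is extra
                         col[i - 1] + 1)    # pattern character is missing
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


print(sellers_search("abcabd", "abd", 1))
```

Under edit distance a match may be shorter or longer than P, which is why more end positions can be reported here than with a fixed-length mismatch count.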

Recommended Reading

  1. Amir A, Lewenstein M, Porat E (2004) Faster algorithms for string matching with k mismatches. J Algorithms 50(2):257–275

  2. Baeza-Yates R, Navarro G (1999) Faster approximate string matching. Algorithmica 23(2):127–158

  3. Chang W, Marr T (1994) Approximate string matching and local similarity. In: Proceedings of the 5th annual symposium on combinatorial pattern matching (CPM’94), Asilomar. LNCS, vol 807. Springer, Berlin, pp 259–273

  4. Cole R, Hariharan R (2002) Approximate string matching: a simpler faster algorithm. SIAM J Comput 31(6):1761–1782

  5. Crochemore M, Landau G, Ziv-Ukelson M (2003) A subquadratic sequence alignment algorithm for unrestricted scoring matrices. SIAM J Comput 32(6):1654–1673

  6. Fredriksson K, Navarro G (2004) Average-optimal single and multiple approximate string matching. ACM J Exp Algorithms 9(1.4)

  7. Gusfield D (1997) Algorithms on strings, trees and sequences. Cambridge University Press, Cambridge

  8. Landau G, Vishkin U (1989) Fast parallel and serial approximate string matching. J Algorithms 10:157–169

  9. Masek W, Paterson M (1980) A faster algorithm for computing string edit distances. J Comput Syst Sci 20:18–31

  10. Myers G (1999) A fast bit-vector algorithm for approximate string matching based on dynamic programming. J ACM 46(3):395–415

  11. Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33(1):31–88

  12. Navarro G, Baeza-Yates R (1999) Very fast and simple approximate string matching. Inf Proc Lett 72:65–70

  13. Sellers P (1980) The theory and computation of evolutionary distances: pattern recognition. J Algorithms 1:359–373

  14. Ukkonen E (1985) Finding approximate patterns in strings. J Algorithms 6:132–137

  15. Yao A (1979) The complexity of pattern matching for a random string. SIAM J Comput 8:368–387

Author information

Corresponding author

Correspondence to Gonzalo Navarro.

Copyright information

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Navarro, G. (2014). Approximate String Matching. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-3-642-27848-8_363-2

  • DOI: https://doi.org/10.1007/978-3-642-27848-8_363-2

  • Publisher Name: Springer, Boston, MA

  • Online ISBN: 978-3-642-27848-8

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering
