Years and Authors of Summarized Original Work
1980; Sellers
1989; Landau, Vishkin
1999; Myers
2003; Crochemore, Landau, Ziv-Ukelson
2004; Fredriksson, Navarro
Problem Definition
Given a text string \(T = t_{1}t_{2}\ldots t_{n}\) and a pattern string \(P = p_{1}p_{2}\ldots p_{m}\), both sequences over an alphabet \(\Sigma \) of size σ, a distance function d among strings, and a threshold k, the approximate string matching (ASM) problem is to find all the text positions that finish a so-called approximate occurrence of P in T, that is, to compute the set \(\{j : \exists i,\ 1 \leq i \leq j,\ d(P,t_{i}\ldots t_{j}) \leq k\}\). In the sequential version of the problem, T, P, and k are given together, whereas the algorithm can be tailored for a specific d.
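The set in this definition can be transcribed almost literally into code. The following is a minimal brute-force sketch, not an algorithm from the summarized works: it accepts any distance function d as a parameter and tests every substring ending at each position j. The names `asm_end_positions` and `hamming` are hypothetical, and the Hamming-style distance is only a stand-in used to make the example runnable.

```python
from typing import Callable, List


def asm_end_positions(text: str, pattern: str, k: int,
                      d: Callable[[str, str], float]) -> List[int]:
    """Return all 1-based positions j such that some substring
    t_i..t_j (1 <= i <= j) satisfies d(pattern, t_i..t_j) <= k.

    Direct transcription of the set definition; it evaluates d on a
    quadratic number of substrings, so it is for illustration only.
    """
    ends = []
    for j in range(1, len(text) + 1):
        # text[i - 1:j] is the substring t_i..t_j in 1-based notation.
        if any(d(pattern, text[i - 1:j]) <= k for i in range(1, j + 1)):
            ends.append(j)
    return ends


def hamming(a: str, b: str) -> float:
    """Illustrative distance (an assumption, not the entry's focus):
    number of mismatching positions, infinite for unequal lengths."""
    if len(a) != len(b):
        return float("inf")
    return sum(x != y for x, y in zip(a, b))


print(asm_end_positions("abcabd", "abd", 1, hamming))  # [3, 6]
```

With k = 1 the substring "abc" ending at position 3 is within distance 1 of "abd", and the exact occurrence ends at position 6; with k = 0 only position 6 is reported.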
The solutions to the problem vary widely depending on the distance d used. This entry focuses on a very popular one, called Levenshtein distance or edit distance, defined as the minimum number of character insertions, deletions,...
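For edit distance, the problem admits a classical dynamic-programming solution in the style of Sellers (1980), sketched below under stated assumptions: unit-cost insertions, deletions, and substitutions (the standard Levenshtein definition), and a column-by-column evaluation that keeps only one column in memory. The function name `sellers_end_positions` is hypothetical, and the code is an illustrative sketch rather than the formulation given in the original paper.

```python
from typing import List


def sellers_end_positions(text: str, pattern: str, k: int) -> List[int]:
    """Report 1-based end positions j where some substring of `text`
    ending at j is within edit distance k of `pattern`.

    col[i] holds the minimum edit distance between pattern[:i] and
    any substring of the text ending at the current position.
    Time is O(m * n); extra space is O(m).
    """
    m = len(pattern)
    # Before any text is read, matching pattern[:i] costs i deletions.
    col = list(range(m + 1))
    ends = []
    for j, tc in enumerate(text, start=1):
        # col[0] stays 0: an occurrence may start at any text position.
        prev_diag = col[0]
        for i in range(1, m + 1):
            old = col[i]  # value from the previous column
            if pattern[i - 1] == tc:
                col[i] = prev_diag
            else:
                # substitution, text-character insertion, pattern-character deletion
                col[i] = 1 + min(prev_diag, old, col[i - 1])
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


print(sellers_end_positions("surgery", "survey", 2))  # [5, 6, 7]
```

The only difference from computing the plain edit distance between two strings is that the top row of the matrix is fixed at zero, which lets an occurrence begin anywhere in the text.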
Recommended Reading
Amir A, Lewenstein M, Porat E (2004) Faster algorithms for string matching with k mismatches. J Algorithms 50(2):257–275
Baeza-Yates R, Navarro G (1999) Faster approximate string matching. Algorithmica 23(2):127–158
Chang W, Marr T (1994) Approximate string matching and local similarity. In: Proceedings of the 5th annual symposium on combinatorial pattern matching (CPM’94), Asilomar. LNCS, vol 807. Springer, Berlin, pp 259–273
Cole R, Hariharan R (2002) Approximate string matching: a simpler faster algorithm. SIAM J Comput 31(6):1761–1782
Crochemore M, Landau G, Ziv-Ukelson M (2003) A subquadratic sequence alignment algorithm for unrestricted scoring matrices. SIAM J Comput 32(6):1654–1673
Fredriksson K, Navarro G (2004) Average-optimal single and multiple approximate string matching. ACM J Exp Algorithms 9(1.4)
Gusfield D (1997) Algorithms on strings, trees and sequences. Cambridge University Press, Cambridge
Landau G, Vishkin U (1989) Fast parallel and serial approximate string matching. J Algorithms 10:157–169
Masek W, Paterson M (1980) A faster algorithm for computing string edit distances. J Comput Syst Sci 20:18–31
Myers G (1999) A fast bit-vector algorithm for approximate string matching based on dynamic programming. J ACM 46(3):395–415
Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33(1):31–88
Navarro G, Baeza-Yates R (1999) Very fast and simple approximate string matching. Inf Proc Lett 72:65–70
Sellers P (1980) The theory and computation of evolutionary distances: pattern recognition. J Algorithms 1:359–373
Ukkonen E (1985) Finding approximate patterns in strings. J Algorithms 6:132–137
Yao A (1979) The complexity of pattern matching for a random string. SIAM J Comput 8:368–387
© 2014 Springer Science+Business Media New York
About this entry
Cite this entry
Navarro, G. (2014). Approximate String Matching. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-3-642-27848-8_363-2
DOI: https://doi.org/10.1007/978-3-642-27848-8_363-2
Publisher Name: Springer, Boston, MA
Online ISBN: 978-3-642-27848-8
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering