Years and Authors of Summarized Original Work
1980; Sellers
1989; Landau, Vishkin
1999; Myers
2003; Crochemore, Landau, Ziv-Ukelson
2004; Fredriksson, Navarro
Problem Definition
Given a text string \(T = t_{1}t_{2}\ldots t_{n}\) and a pattern string \(P = p_{1}p_{2}\ldots p_{m}\), both being sequences over an alphabet \(\Sigma \) of size \(\sigma\), and given a distance function among strings d and a threshold k, the approximate string matching (ASM) problem is to find all the text positions that finish the so-called approximate occurrence of P in T, that is, compute the set \(\{j : \exists i,\ 1 \leq i \leq j,\ d(P,t_{i}\ldots t_{j}) \leq k\}\). In the sequential version of the problem, T, P, and k are given together, whereas the algorithm can be tailored for a specific d.
The solutions to the problem vary widely depending on the distance d used. This entry focuses on a very popular one, called Levenshtein distance or edit distance, defined as the minimum number of character insertions, deletions, and substitutions...
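To make the definition concrete, the following is a minimal sketch of the classical column-wise dynamic programming approach usually attributed to Sellers (1980), specialized to edit distance. The preview text does not show the entry's own formulation, so the function name `approx_end_positions` and its parameters are hypothetical choices made for illustration. It takes O(mn) time and O(m) extra space.

```python
def approx_end_positions(pattern: str, text: str, k: int) -> list[int]:
    """Return the 1-based text positions j such that some substring
    t_i..t_j is within edit distance k of the pattern.

    Illustrative sketch only; names are assumptions, not taken from the entry.
    """
    m = len(pattern)
    # col[r] holds the minimum edit distance between pattern[:r] and any
    # substring of the text ending at the current text position.
    # Before reading any text, matching pattern[:r] costs r insertions.
    col = list(range(m + 1))
    ends = []
    for j, tc in enumerate(text, start=1):
        prev_diag = col[0]  # value of cell (r-1, j-1)
        col[0] = 0          # an occurrence may start at any text position
        for r in range(1, m + 1):
            cost = 0 if pattern[r - 1] == tc else 1
            new_val = min(
                prev_diag + cost,  # match or substitution
                col[r] + 1,        # cell (r, j-1): extra text character
                col[r - 1] + 1,    # cell (r-1, j): extra pattern character
            )
            prev_diag = col[r]
            col[r] = new_val
        if col[m] <= k:
            ends.append(j)
    return ends


print(approx_end_positions("survey", "surgery", 2))  # [5, 6, 7]
```

Setting the top cell of every column to zero is what turns a plain edit distance computation into a search: it lets an occurrence begin anywhere in T, so the last cell of column j is the minimum of \(d(P,t_{i}\ldots t_{j})\) over all starting positions i.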
Recommended Reading
Amir A, Lewenstein M, Porat E (2004) Faster algorithms for string matching with k mismatches. J Algorithms 50(2):257–275
Baeza-Yates R, Navarro G (1999) Faster approximate string matching. Algorithmica 23(2):127–158
Chang W, Marr T (1994) Approximate string matching and local similarity. In: Proceedings of the 5th annual symposium on combinatorial pattern matching (CPM’94), Asilomar. LNCS, vol 807. Springer, Berlin, pp 259–273
Cole R, Hariharan R (2002) Approximate string matching: a simpler faster algorithm. SIAM J Comput 31(6):1761–1782
Crochemore M, Landau G, Ziv-Ukelson M (2003) A subquadratic sequence alignment algorithm for unrestricted scoring matrices. SIAM J Comput 32(6):1654–1673
Fredriksson K, Navarro G (2004) Average-optimal single and multiple approximate string matching. ACM J Exp Algorithms 9(1.4)
Gusfield D (1997) Algorithms on strings, trees and sequences. Cambridge University Press, Cambridge
Landau G, Vishkin U (1989) Fast parallel and serial approximate string matching. J Algorithms 10:157–169
Masek W, Paterson M (1980) A faster algorithm for computing string edit distances. J Comput Syst Sci 20:18–31
Myers G (1999) A fast bit-vector algorithm for approximate string matching based on dynamic programming. J ACM 46(3):395–415
Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33(1):31–88
Navarro G, Baeza-Yates R (1999) Very fast and simple approximate string matching. Inf Proc Lett 72:65–70
Sellers P (1980) The theory and computation of evolutionary distances: pattern recognition. J Algorithms 1:359–373
Ukkonen E (1985) Finding approximate patterns in strings. J Algorithms 6:132–137
Yao A (1979) The complexity of pattern matching for a random string. SIAM J Comput 8:368–387
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Navarro, G. (2016). Approximate String Matching. In: Kao, M.-Y. (ed.) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_363
DOI: https://doi.org/10.1007/978-1-4939-2864-4_363
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering