Abstract
We address the problem of approximate string matching in $d$ dimensions, that is, finding a pattern of size $m^d$ in a text of size $n^d$ with at most $k < m^d$ errors (substitutions, insertions and deletions along any dimension). We use a novel and very flexible error model, for which the only existing algorithm evaluates the similarity between two elements in two dimensions in $O(m^4)$ time. We extend that algorithm to $d$ dimensions, in $O(d!\,m^{2d})$ time and $O(d!\,m^{2d-1})$ space. We also give the first search algorithm for this model, which takes $O(d!\,m^d n^d)$ time and $O(d!\,m^d n^{d-1})$ space. We show how to reduce the space cost to $O(d!\,3^d m^{2d-1})$ with little time penalty. Finally, we present the first searching algorithm that is sublinear-time on average (i.e. not all text cells are inspected), which takes $O(k n^d / m^{d-1})$ time for $k < (m / (d(\log_\sigma m - \log_\sigma d)))^{d-1}$, where $\sigma$ is the alphabet size. Beyond that error level the filter still remains better than dynamic programming for $k \le m^{d-1} / (d(\log_\sigma m - \log_\sigma d))^{(d-1)/d}$. These are the first search algorithms for the problem. As side effects, we extend to $d$ dimensions a previously proposed algorithm for two-dimensional exact string matching, and we obtain a sublinear-time filter for searching in $d$ dimensions allowing $k$ mismatches.
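To get a feel for the two error-level bounds stated in the abstract, the following minimal sketch evaluates them numerically for a given pattern side m, dimension d and alphabet size σ. The function name `error_thresholds` and its parameter names are hypothetical, chosen only for illustration. The bounds are asymptotic, so the numbers ignore the hidden constants and should be read as rough orders of magnitude rather than exact cut-off points.

```python
import math


def error_thresholds(m: int, d: int, sigma: int) -> tuple[float, float]:
    """Evaluate the two bounds on k quoted in the abstract (constants ignored).

    Returns (sublinear_bound, filter_bound) where
      sublinear_bound = (m / (d * (log_sigma m - log_sigma d))) ** (d - 1)
      filter_bound    = m ** (d - 1) / (d * (log_sigma m - log_sigma d)) ** ((d - 1) / d)
    """
    if d < 1 or sigma < 2:
        raise ValueError("need d >= 1 and sigma >= 2")
    if m <= d:
        # The logarithmic term must be positive for the formulas to make sense.
        raise ValueError("need m > d so that log_sigma m - log_sigma d > 0")

    # log_sigma m - log_sigma d, computed with base-sigma logarithms.
    log_term = math.log(m, sigma) - math.log(d, sigma)

    # Error level below which the filter is sublinear on average.
    sublinear_bound = (m / (d * log_term)) ** (d - 1)

    # Error level up to which the filter still beats dynamic programming.
    filter_bound = m ** (d - 1) / (d * log_term) ** ((d - 1) / d)

    return sublinear_bound, filter_bound


if __name__ == "__main__":
    # Example: a 64 x 64 pattern (d = 2) over a binary alphabet.
    print(error_thresholds(64, 2, 2))
```

For the parameters in the usage example the logarithmic term is log_2 64 - log_2 2 = 5, so the first bound is 64 / 10 and the second is 64 / sqrt(10); the second is the larger one, matching the abstract's statement that the filter remains useful past the sublinearity limit.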
Supported in part by Fondecyt grant 1-990627.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Navarro, G., Baeza-Yates, R. (1999). Fast Multi-dimensional Approximate Pattern Matching. In: Crochemore, M., Paterson, M. (eds) Combinatorial Pattern Matching. CPM 1999. Lecture Notes in Computer Science, vol 1645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48452-3_18
DOI: https://doi.org/10.1007/3-540-48452-3_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66278-5
Online ISBN: 978-3-540-48452-3
eBook Packages: Springer Book Archive