Analysis of two-dimensional approximate pattern matching algorithms

  • Kunsoo Park
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1075)

Abstract

A k-approximate occurrence of a pattern in a text is an occurrence whose Hamming distance from the pattern is at most k. The problem of two-dimensional approximate pattern matching is defined as follows: given a pattern P of size \(m^2\), a text T of size \(n^2\), and an integer k, find all k-approximate occurrences of P in T.
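To make the problem statement concrete, here is a minimal brute-force sketch of the definition. It is not one of the algorithms analysed in the paper; it simply checks every alignment and counts mismatches. The function name `k_approximate_occurrences` and the representation of P and T as lists of equal-length strings are assumptions made for illustration.

```python
def k_approximate_occurrences(pattern, text, k):
    """Return the top-left (row, col) of every alignment of a square
    pattern in a square text with at most k mismatching cells.

    Illustrative brute force only: worst-case time is O(n^2 * m^2).
    """
    m = len(pattern)
    n = len(text)
    hits = []
    for r in range(n - m + 1):
        for c in range(n - m + 1):
            mismatches = 0
            for i in range(m):
                for j in range(m):
                    if text[r + i][c + j] != pattern[i][j]:
                        mismatches += 1
                        if mismatches > k:
                            break  # this alignment already exceeds k
                if mismatches > k:
                    break
            if mismatches <= k:
                hits.append((r, c))
    return hits


if __name__ == "__main__":
    text = ["aba",
            "cdc",
            "abd"]
    pattern = ["ab",
               "cd"]
    print(k_approximate_occurrences(pattern, text, 0))
    print(k_approximate_occurrences(pattern, text, 3))
```

With k = 0 only exact occurrences are reported; raising k admits alignments that differ from P in up to k cells.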

Kärkkäinen and Ukkonen [7] proposed two algorithms for two-dimensional approximate pattern matching and showed that their expected time for random input is \(O(kn^2(\log m)/m^2)\) for \(k \le m^2/(4\lceil \log_\sigma m^2 \rceil)\), where \(\sigma\) is the size of the alphabet. However, their analysis relies on an independence assumption. In this paper we present a new analysis of the two algorithms which shows that the expected time is the same \(O(kn^2(\log m)/m^2)\) for \(k \le \left\lfloor \frac{m}{\lceil \log_\sigma m^2 \rceil} \right\rfloor \cdot \frac{m}{2} - 1\) without the independence assumption. Hence our analysis is stronger than that of [7] in that (i) it removes the independence assumption and (ii) it covers a larger range of k. It is also shown that the two algorithms in [7] have an undesirable factor n in their space complexities. We present modifications of these algorithms which use space \(O(m^2)\) in the worst case and O(k) on average while maintaining the same expected time.
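As a rough numerical illustration of claim (ii), the sketch below evaluates both limits on k for a couple of sample parameters. It takes the bound attributed to [7] to be k ≤ m²/(4⌈log_σ m²⌉), which is an assumed reading of the abstract, and the new bound to be k ≤ ⌊m/⌈log_σ m²⌉⌋ · m/2 − 1. The helper names `ceil_log`, `old_bound` and `new_bound` are hypothetical, and the results are rounded down to the largest admissible integer k.

```python
def ceil_log(base, x):
    """Smallest integer e with base**e >= x, using integer arithmetic only."""
    e, power = 0, 1
    while power < x:
        power *= base
        e += 1
    return e


def old_bound(m, sigma):
    """Largest integer k with k <= m^2 / (4 * ceil(log_sigma m^2)).
    Assumed reading of the range stated for [7]."""
    return (m * m) // (4 * ceil_log(sigma, m * m))


def new_bound(m, sigma):
    """Largest integer k with k <= floor(m / ceil(log_sigma m^2)) * m/2 - 1."""
    return (m // ceil_log(sigma, m * m)) * m // 2 - 1


if __name__ == "__main__":
    for m, sigma in [(16, 2), (64, 4)]:
        print(m, sigma, old_bound(m, sigma), new_bound(m, sigma))
```

For these sample values the new limit is close to twice the old one, which matches the ratio of the leading terms m²/2 and m²/4. For very small m the floor and the −1 term can dominate, so the comparison should be read as indicative only.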

Keywords

Hash Table, Independence Assumption, Pattern Sample, Text Sample, Levenshtein Distance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. A.V. Aho and M.J. Corasick, Efficient string matching: An aid to bibliographic search, Comm. ACM 18 (1975), 333–340.
  2. A. Amir and M. Farach, Efficient 2-dimensional approximate matching of non-rectangular figures, Proc. 2nd ACM-SIAM Symp. Discrete Algorithms, 1991, 212–223.
  3. A. Amir and G.M. Landau, Fast parallel and serial multidimensional approximate array matching, Theoret. Comput. Sci. 81 (1991), 97–115.
  4. W.I. Chang and E.L. Lawler, Approximate string matching in sublinear expected time, Proc. 31st IEEE Symp. Found. Computer Science, 1990, 116–124.
  5. Z. Galil and R. Giancarlo, Data structure and algorithms for approximate string matching, J. Complexity 4 (1988), 33–72.
  6. Z. Galil and K. Park, An improved algorithm for approximate string matching, SIAM J. Comput. 19 (1990), 989–999.
  7. J. Kärkkäinen and E. Ukkonen, Two and higher dimensional pattern matching in optimal expected time, Proc. ACM-SIAM Symp. Discrete Algorithms, 1994, 715–723.
  8. D.E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973.
  9. K. Krithivasan and R. Sitalakshmi, Efficient two-dimensional pattern matching in the presence of errors, Information Sciences 43 (1987), 169–184.
  10. G.M. Landau and U. Vishkin, Efficient string matching in the presence of errors, Proc. 26th IEEE Symp. Found. Computer Science, 1985, 126–136.
  11. S. Ranka and T. Heywood, Two-dimensional pattern matching with k mismatches, Pattern Recognition 24, 1 (1991), 31–40.
  12. S.M. Ross, A First Course in Probability, 4th ed., Macmillan, 1994.

Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Kunsoo Park, Department of Computer Engineering, Seoul National University, Seoul, Korea
