Approximate string matching with don't care characters

  • Tatsuya Akutsu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 807)

Abstract

This paper presents parallel and serial approximate matching algorithms for strings with don't care characters. They are based on Landau and Vishkin's approximate string matching algorithm and Fischer and Paterson's exact string matching algorithm with don't care characters. The serial algorithm works in $O(\sqrt{km}\, n \log|\Sigma| \log^2(m/k) \log\log(m/k))$ time, and the parallel algorithm works in $O(k \log m)$ time using $O(\sqrt{m/k}\, n \log|\Sigma| \log(m/k) \log\log(m/k))$ processors on a CRCW-PRAM, where $n$ denotes the length of a text string, $m$ denotes the length of a pattern string, $k$ denotes the maximum number of differences, and $\Sigma$ denotes the alphabet (i.e. the set of characters). Several extensions are also described.
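To make the problem statement concrete, the following is a minimal sketch of the naive O(mn) dynamic-programming baseline for k-differences matching, extended so that a don't care character matches anything. It is not the algorithm of this paper, which is faster; it only illustrates what is being computed. The wildcard symbol `*`, the function names, and the convention of reporting 1-based end positions are all assumptions made for this example.

```python
DONT_CARE = "*"  # hypothetical don't care symbol chosen for this sketch


def chars_match(a, b, wildcard=DONT_CARE):
    """Two characters match if they are equal or either is a don't care."""
    return a == b or a == wildcard or b == wildcard


def approx_match_ends(text, pattern, k, wildcard=DONT_CARE):
    """Return the 1-based text positions j at which some substring of `text`
    ending at j is within k differences (insertions, deletions,
    substitutions) of `pattern`. Naive O(mn) column-by-column DP."""
    m = len(pattern)
    # prev[i] = minimum differences between pattern[:i] and a substring of
    # the text ending at the previous position; before any text, it is i.
    prev = list(range(m + 1))
    ends = []
    for j, tc in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if chars_match(pattern[i - 1], tc, wildcard) else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character left unmatched
            )
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


if __name__ == "__main__":
    print(approx_match_ends("abcde", "b*d", 0))  # [4]
    print(approx_match_ends("abcde", "b*d", 1))  # [3, 4, 5]
```

With k = 0 the sketch reduces to exact matching with don't cares, the setting of Fischer and Paterson's algorithm; with no don't care characters it reduces to the k-differences problem addressed by Landau and Vishkin.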

Keywords

approximate string matching, don't care characters, sequence analysis

References

  1. K. Abrahamson. “Generalized string matching”. SIAM Journal on Computing, Vol. 16, pp. 1039–1051, 1987.
  2. A. Amir and G. Landau. “Fast parallel and serial multidimensional approximate array matching”. Theoretical Computer Science, Vol. 81, pp. 97–115, 1991.
  3. A. Apostolico, C. Iliopoulos, G. M. Landau, B. Schieber, and U. Vishkin. “Parallel construction of a suffix tree with applications”. Algorithmica, Vol. 3, pp. 347–365, 1988.
  4. C. Branden and J. Tooze. Introduction to Protein Structure. Garland Publishing Inc., New York, 1991.
  5. M. Fischer and M. Paterson. “String matching and other products”. In Complexity of Computation (SIAM-AMS Proceedings), Vol. 7, pp. 113–125, 1974.
  6. Z. Galil and R. Giancarlo. “Data structures and algorithms for approximate string matching”. Journal of Complexity, Vol. 4, pp. 33–72, 1988.
  7. Z. Galil and K. Park. “An improved algorithm for approximate string matching”. SIAM Journal on Computing, Vol. 19, pp. 989–999, 1990.
  8. G. von Heijne. Sequence Analysis in Molecular Biology: Treasure Trove or Trivial Pursuit. Academic Press, Inc., San Diego, 1987.
  9. J. JáJá. An Introduction to Parallel Algorithms. Addison-Wesley, Massachusetts, 1992.
  10. G. M. Landau and U. Vishkin. “Fast parallel and serial approximate string matching”. Journal of Algorithms, Vol. 10, pp. 157–169, 1989.
  11. U. Manber and R. Baeza-Yates. “An algorithm for string matching with a sequence of don't cares”. Information Processing Letters, Vol. 37, pp. 133–136, 1991.
  12. P. Weiner. “Linear pattern matching algorithms”. In Proceedings of IEEE Symposium on Switching and Automata Theory, pp. 1–11, 1973.

Copyright information

© Springer-Verlag 1994

Authors and Affiliations

  • Tatsuya Akutsu (1)
  1. Mechanical Engineering Laboratory, Ibaraki, Japan