Two algorithms for the longest common subsequence of three (or more) strings

  • Robert W. Irving
  • Campbell B. Fraser
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 644)

Abstract

Various algorithms have been proposed, over the years, for the longest common subsequence problem on 2 strings (2-LCS), many of these improving, at least for some cases, on the classical dynamic programming approach. However, relatively little attention has been paid in the literature to the k-LCS problem for k > 2, a problem that has interesting applications in areas such as the multiple alignment of sequences in molecular biology.

In this paper, we describe and analyse two algorithms with particular reference to the 3-LCS problem, though each algorithm can be extended to solve the k-LCS problem for general k. The first algorithm, which can be viewed as a “lazy” version of dynamic programming, has time and space complexity that is O(n(n−l)^2) for 3 strings, and O(kn(n−l)^{k−1}) for k strings, where n is the common length of the strings and l is the length of an LCS. The second algorithm, which involves evaluating entries in a “threshold” table in diagonal order, has time and space complexity that is O(l(n−l)^2 + sn) for 3 strings, and O(kl(n−l)^{k−1} + ksn) for k strings, where s is the alphabet size. For simplicity, the algorithms are presented for equal-length strings, though extension to unequal-length strings is straightforward.
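
For orientation, the basic dynamic programming approach that serves as the baseline for 3 strings can be sketched as below. This is a minimal sketch of the textbook recurrence, which takes time proportional to n^3 for strings of length n; it is not either of the two algorithms described in the paper, and the function name `lcs3_length` is an assumption made for illustration.

```python
def lcs3_length(a: str, b: str, c: str) -> int:
    """Length of a longest common subsequence of three strings.

    Textbook dynamic programming baseline (hypothetical helper name),
    not the "lazy" or "threshold" algorithm of the paper.
    """
    na, nb, nc = len(a), len(b), len(c)
    # prev[j][k] holds the LCS length of a[:i-1], b[:j], c[:k];
    # only two 2-D layers are kept, since just the length is needed.
    prev = [[0] * (nc + 1) for _ in range(nb + 1)]
    for i in range(1, na + 1):
        curr = [[0] * (nc + 1) for _ in range(nb + 1)]
        for j in range(1, nb + 1):
            for k in range(1, nc + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    # All three strings agree here: extend the diagonal entry.
                    curr[j][k] = prev[j - 1][k - 1] + 1
                else:
                    # Otherwise drop the last symbol of one of the strings.
                    curr[j][k] = max(prev[j][k], curr[j - 1][k], curr[j][k - 1])
        prev = curr
    return prev[nb][nc]
```

The sketch accepts strings of unequal length, although the abstract assumes a common length n. Every one of the roughly n^3 table entries is evaluated regardless of l, which is the cost the bounds quoted above are intended to improve on.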

Empirical evidence is presented to show that both algorithms improve significantly on the basic dynamic programming approach, and on an earlier algorithm proposed by Hsu and Du, particularly, as would be expected, when l is relatively large. The balance of evidence is heavily in favour of the threshold approach.
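
The dependence on l can be seen by evaluating the leading terms of the stated bounds for 3 strings, reading them as n(n−l)^2 for the lazy algorithm and l(n−l)^2 + sn for the threshold algorithm, with l the LCS length, against n^3 for basic dynamic programming. The snippet below is only an arithmetic illustration: constant factors are ignored, the values are not running times or results from the paper, and the name `bound_terms` is an assumption.

```python
def bound_terms(n: int, l: int, s: int) -> dict:
    """Leading terms of the asymptotic bounds for 3 strings.

    n: common string length, l: LCS length, s: alphabet size.
    Constant factors are ignored, so these are not running times.
    """
    return {
        "basic_dp": n ** 3,
        "lazy": n * (n - l) ** 2,
        "threshold": l * (n - l) ** 2 + s * n,
    }


# LCS long relative to n: both terms are far below n^3.
print(bound_terms(100, 90, 4))
# {'basic_dp': 1000000, 'lazy': 10000, 'threshold': 9400}

# LCS short relative to n: the lazy term approaches n^3.
print(bound_terms(100, 10, 4))
# {'basic_dp': 1000000, 'lazy': 810000, 'threshold': 81400}
```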

Key words

string algorithms, longest common subsequence

References

  1. A. Apostolico. Improving the worst-case performance of the Hunt-Szymanski strategy for the longest common subsequence of two strings. Information Processing Letters, 23:63–69, 1986.
  2. A. Apostolico, S. Browne, and C. Guerra. Fast linear-space computations of longest common subsequences. Theoretical Computer Science, 92:3–17, 1992.
  3. A. Apostolico and C. Guerra. The longest common subsequence problem revisited. Algorithmica, 2:315–336, 1987.
  4. D.S. Hirschberg. A linear space algorithm for computing maximal common subsequences. Communications of the A.C.M., 18:341–343, 1975.
  5. D.S. Hirschberg. Algorithms for the longest common subsequence problem. Journal of the A.C.M., 24:664–675, 1977.
  6. W.J. Hsu and M.W. Du. Computing a longest common subsequence for a set of strings. BIT, 24:45–59, 1984.
  7. J.W. Hunt and T.G. Szymanski. A fast algorithm for computing longest common subsequences. Communications of the A.C.M., 20:350–353, 1977.
  8. S.Y. Itoga. The string merging problem. BIT, 21:20–30, 1981.
  9. W.J. Masek and M.S. Paterson. A faster algorithm for computing string editing distances. J. Comput. System Sci., 20:18–31, 1980.
  10. E.W. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1:251–266, 1986.
  11. N. Nakatsu, Y. Kambayashi, and S. Yajima. A longest common subsequence algorithm suitable for similar text strings. Acta Informatica, 18:171–179, 1982.
  12. D. Sankoff. Matching sequences under deletion insertion constraints. Proc. Nat. Acad. Sci. U.S.A., 69:4–6, 1972.
  13. E. Ukkonen. Algorithms for approximate string matching. Information and Control, 64:100–118, 1985.
  14. R.A. Wagner and M.J. Fischer. The string-to-string correction problem. Journal of the A.C.M., 21:168–173, 1974.
  15. S. Wu, U. Manber, G. Myers, and W. Miller. An O(NP) sequence comparison algorithm. Information Processing Letters, 35:317–323, 1990.

Copyright information

© Springer-Verlag 1992

Authors and Affiliations

  • Robert W. Irving (1)
  • Campbell B. Fraser (1)
  1. Computing Science Department, University of Glasgow, Glasgow, Scotland
