
Algorithmica, Volume 12, Issue 4–5, pp 293–311

Performance analysis of some simple heuristics for computing longest common subsequences

  • F. Chin
  • C. K. Poon
Article

Abstract

Although the Longest Common Subsequence (LCS) Problem has been studied by many researchers for years, heuristic methods have not been investigated before. In this paper we present a simple heuristic which guarantees to return a common subsequence of length at least 1/s that of the longest, where s is the number of different symbols in the input strings. Furthermore, we generalize the idea to several classes of heuristic algorithms. Surprisingly, we find that no other heuristic in these classes outperforms this simple algorithm. In other words, we show that any heuristic which uses only global information, such as the number of symbol occurrences, might return a common subsequence as short as 1/s of the length of the longest. Analysis of the average performance of the simple heuristic for s = 2 is also presented.
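
The abstract does not spell out the simple heuristic, so the following is only a sketch of one heuristic that meets the stated 1/s bound using nothing but symbol counts: for each symbol, take the smaller of its occurrence counts in the two strings, and return the best symbol repeated that many times. The bound holds because some symbol must occur at least 1/s of the time in any longest common subsequence, and that symbol occurs at least as often in both inputs. The function names `single_symbol_heuristic` and `lcs_length` are illustrative assumptions, not names from the paper.

```python
from collections import Counter


def single_symbol_heuristic(x: str, y: str) -> str:
    """Return the longest common subsequence made of one repeated symbol.

    Assumption: this counting rule is an illustration of a heuristic that
    uses only global information; it is not taken verbatim from the paper.
    """
    count_x, count_y = Counter(x), Counter(y)
    best_symbol, best_len = "", 0
    # Sorted order makes tie-breaking deterministic.
    for symbol in sorted(count_x):
        # A symbol can be repeated only as often as it occurs in both strings.
        common = min(count_x[symbol], count_y[symbol])
        if common > best_len:
            best_symbol, best_len = symbol, common
    return best_symbol * best_len


def lcs_length(x: str, y: str) -> int:
    """Exact LCS length by the textbook dynamic program, for comparison."""
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    x, y = "abab", "baba"
    guess = single_symbol_heuristic(x, y)
    exact = lcs_length(x, y)
    s = len(set(x) | set(y))
    print(guess, len(guess), exact, s)
```

For the inputs "abab" and "baba" (s = 2), the sketch returns "aa" of length 2, while the exact LCS has length 3, so the result stays above the guaranteed 3/2.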

Key words

Longest common subsequence; Heuristics; Performance analysis; Scan algorithms

Copyright information

© Springer-Verlag New York Inc. 1994

Authors and Affiliations

  • F. Chin (1)
  • C. K. Poon (1)
  1. Department of Computer Science, University of Hong Kong, Hong Kong
