A longest common subsequence algorithm suitable for similar text strings
Efficient algorithms for computing the longest common subsequence (LCS for short) are discussed. The O(pn) algorithm and the O(p(m-p) log n) algorithm [Hirschberg 1977] appear to be the best among previously known algorithms, where p is the length of an LCS and m and n are the lengths of the two given strings (m≦n). There are many applications where the expected length of an LCS is close to m.
In this paper, an O(n(m-p)) algorithm is presented. When p is close to m (in other words, when the two given strings are similar), the algorithm presented here runs much faster than previously known algorithms.
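To make the quantities p, m, and n concrete, here is a minimal sketch that computes the LCS length with the textbook O(mn) dynamic program. This is not the O(n(m-p)) algorithm of the paper; it is only a baseline for illustration, and the function name `lcs_length` and the sample strings are assumptions introduced here.

```python
def lcs_length(a: str, b: str) -> int:
    """Return p, the length of a longest common subsequence of a and b.

    Textbook O(mn) dynamic program (a baseline, NOT the paper's algorithm).
    """
    if len(a) > len(b):
        a, b = b, a  # follow the abstract's convention m <= n
    # prev[i] holds the LCS length of a[:i] and the prefix of b seen so far.
    prev = [0] * (len(a) + 1)
    for ch_b in b:
        curr = [0]
        for i, ch_a in enumerate(a, start=1):
            if ch_a == ch_b:
                curr.append(prev[i - 1] + 1)
            else:
                curr.append(max(prev[i], curr[i - 1]))
        prev = curr
    return prev[-1]


# Two similar strings (hypothetical example data): p is close to m.
a, b = "kitten", "kitchen"
m, n = sorted((len(a), len(b)))
p = lcs_length(a, b)
print(m, n, p, m - p)  # 6 7 5 1
```

For these inputs m - p = 1, which is the regime the abstract targets: a bound of O(n(m-p)) shrinks as the strings become more similar, whereas the baseline above always does work proportional to mn.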
Keywords: Information System, Operating System, Data Structure, Communication Network, Information Theory
- 1. Aho, A.V., Hirschberg, D.S., Ullman, J.D.: Bounds on the Complexity of the Longest Common Subsequence Problem. J. ACM 23, 1, 1–12 (1976)
- 2. Hunt, J.W., Szymanski, T.G.: A Fast Algorithm for Computing Longest Common Subsequences. C. ACM 20, 5, 350–353 (1977)
- 3. Hirschberg, D.S.: Algorithms for the Longest Common Subsequence Problem. J. ACM 24, 4, 664–675 (1977)
- 4. Hirschberg, D.S.: An Information-Theoretic Lower Bound for the Longest Common Subsequence Problem. Inform. Process. Lett. 7, 1, 40–41 (1978)
- 5. Kambayashi, Y., Nakatsu, N., Yajima, S.: Hierarchical String Pattern Matching Using Dynamic Pattern Matching Machines. Proc. IEEE COMPSAC 79, 813–818 (1979)
- 6. Mukhopadhyay, A.: A Fast Algorithm for the Longest-Common-Subsequence Problem. Inform. Sci. 20, 69–82 (1980)