Abstract
A classical measure of similarity between strings is the length of the longest common subsequence (LCS) of the two given strings. The search for efficient algorithms for finding the LCS has been going on for more than three decades. To date, all known algorithms may take quadratic time (up to logarithmic-factor savings) to find a large LCS. In this paper the problem of approximating the LCS is studied, focusing on the hard inputs for this problem: approximating an LCS of near-linear size in strings over a relatively large alphabet (of size at least n^ε for some constant ε > 0, where n is the length of the string). We show that any given string over a relatively large alphabet can be embedded into a locally non-repetitive string. This embedding has a negligible additive distortion for strings that are not too dissimilar in terms of the edit distance. We also show that the LCS can be efficiently approximated in locally non-repetitive strings.
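For orientation, the quadratic-time baseline mentioned above can be sketched as the textbook dynamic program for the LCS length. This is a minimal illustration only, not the approximation algorithm of the paper; the function name `lcs_length` is a hypothetical choice made for this sketch.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b.

    Textbook dynamic program: O(len(a) * len(b)) time, O(len(b)) space.
    Illustrative baseline only; not the method proposed in the paper.
    """
    # prev[j] is the LCS length of the already-processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]  # LCS with the empty prefix of b is 0
        for j, other in enumerate(b, start=1):
            if ch == other:
                # Extend the best alignment of the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of one prefix or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

The nested loops touch every pair of positions once, which is the quadratic cost that the approximation approach aims to avoid on hard inputs.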
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Landau, G.M., Levy, A., Newman, I. (2009). LCS Approximation via Embedding into Local Non-repetitive Strings. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_9
DOI: https://doi.org/10.1007/978-3-642-02441-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02440-5
Online ISBN: 978-3-642-02441-2
eBook Packages: Computer Science, Computer Science (R0)