LCS Approximation via Embedding into Local Non-repetitive Strings

  • Conference paper
Combinatorial Pattern Matching (CPM 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5577)

Abstract

A classical measure of similarity between strings is the length of the longest common subsequence (LCS) of the two given strings. The search for efficient algorithms for finding the LCS has been going on for more than three decades. To date, all known algorithms may take quadratic time (shaved by logarithmic factors) to find a large LCS. This paper studies the problem of approximating the LCS, focusing on the hard inputs for this problem, namely, approximating an LCS of near-linear size in strings over a relatively large alphabet (of size at least n^ε for some constant ε > 0, where n is the length of the string). We show that any given string over a relatively large alphabet can be embedded into a locally non-repetitive string. This embedding has a negligible additive distortion for strings that are not too dissimilar in terms of the edit distance. We also show that the LCS can be efficiently approximated in locally non-repetitive strings.
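
As a rough illustration of why non-repetitiveness helps, the sketch below contrasts the classical quadratic dynamic program with the well-known reduction from LCS to longest increasing subsequence, which applies when no symbol repeats anywhere in either string. This is not the embedding or the approximation algorithm of the paper, which handles the weaker, local notion of non-repetitiveness; the function names `lcs_length_dp` and `lcs_length_non_repetitive` are hypothetical and chosen only for this example.

```python
from bisect import bisect_left


def lcs_length_dp(a, b):
    """Classical dynamic program: O(len(a) * len(b)) time, O(len(b)) space."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_length_non_repetitive(a, b):
    """LCS length when no symbol repeats within a or within b.

    Assumption for this sketch: the strings are globally non-repetitive,
    a stronger condition than the local one studied in the paper.
    Each symbol of a is mapped to its unique position in b; a common
    subsequence then corresponds to an increasing run of positions.
    """
    if len(set(a)) != len(a) or len(set(b)) != len(b):
        raise ValueError("both inputs must be free of repeated symbols")
    pos_in_b = {sym: j for j, sym in enumerate(b)}
    positions = [pos_in_b[sym] for sym in a if sym in pos_in_b]
    # tails[k] is the smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far (patience sorting).
    tails = []
    for p in positions:
        k = bisect_left(tails, p)
        if k == len(tails):
            tails.append(p)
        else:
            tails[k] = p
    return len(tails)
```

On non-repetitive inputs the second function runs in O(n log n) time and should agree with the quadratic dynamic program, for instance on the pair `"abcdef"` and `"acbfed"`.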

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Landau, G.M., Levy, A., Newman, I. (2009). LCS Approximation via Embedding into Local Non-repetitive Strings. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_9

  • DOI: https://doi.org/10.1007/978-3-642-02441-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02440-5

  • Online ISBN: 978-3-642-02441-2

  • eBook Packages: Computer Science, Computer Science (R0)
