Longest Common Subsequence from Fragments via Sparse Dynamic Programming

  • Conference paper
Algorithms — ESA’ 98 (ESA 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1461)

Abstract

Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as Computer Science, Computational Biology and Speech Recognition [7], [11], [15]. We provide a new Sparse Dynamic Programming technique that extends the Hunt-Szymanski paradigm [2], [9], [8] for the computation of the Longest Common Subsequence (LCS), and we apply it to solve the LCS from Fragments problem: given a pair of strings X and Y (of length n and m, respectively) and a set M of matching substrings of X and Y, find the longest common subsequence based only on the symbol correspondences induced by the substrings. This problem arises in an application to the analysis of software systems. Our algorithm solves the problem in O(|M| log |M|) time using balanced trees, or O(|M| log log min(|M|, nm/|M|)) time using Johnson’s version of Flat Trees [10]. These bounds apply for two cost measures. The algorithm can also be adapted to find the usual LCS in O((m + n) log |Σ| + |M| log |M|) time using balanced trees, or O((m + n) log |Σ| + |M| log log min(|M|, nm/|M|)) time using Johnson’s Flat Trees, where M is the set of maximal matches between substrings of X and Y and Σ is the alphabet.
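To make the problem statement concrete, the following is a minimal Python sketch of the classical Hunt-Szymanski-style sparse dynamic program that the abstract says the paper extends. It is not the paper's algorithm. Fragments are assumed to be triples (i, j, length) meaning X[i..i+length-1] matches Y[j..j+length-1] (0-based), and the cost measure is assumed to be the number of matched symbols; the function names are hypothetical. Because each fragment is expanded into its individual symbol correspondences, the running time depends on the total fragment length rather than on |M|, which is exactly the overhead the paper's technique is designed to avoid.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_length_from_points(points):
    """Longest chain of match points (i, j) strictly increasing in both i and j."""
    by_row = defaultdict(set)
    for i, j in points:
        by_row[i].add(j)
    # thresh[k] = smallest column at which a chain of length k + 1 can end so far
    thresh = []
    for i in sorted(by_row):
        # Columns are visited in decreasing order so that two points
        # in the same row can never be chained together.
        for j in sorted(by_row[i], reverse=True):
            k = bisect_left(thresh, j)
            if k == len(thresh):
                thresh.append(j)
            else:
                thresh[k] = j
    return len(thresh)


def fragments_to_points(fragments):
    """Expand fragments (i, j, length) into the symbol correspondences they induce."""
    return {(i + d, j + d) for i, j, length in fragments for d in range(length)}


def lcs_length_from_fragments(fragments):
    """Naive baseline: LCS length using only correspondences induced by the fragments."""
    return lcs_length_from_points(fragments_to_points(fragments))


def all_match_points(x, y):
    """All pairs (i, j) with x[i] == y[j], as used for the usual LCS."""
    positions = defaultdict(list)
    for j, c in enumerate(y):
        positions[c].append(j)
    return {(i, j) for i, c in enumerate(x) for j in positions.get(c, ())}
```

For the usual LCS, `lcs_length_from_points(all_match_points(x, y))` gives the length over all symbol matches. For the fragment version, two crossing fragments such as `(0, 3, 2)` and `(3, 0, 2)` cannot both be used, so only one of them contributes to the result.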

Work supported in part by grants from the Italian Ministry of Scientific Research and by the Italian National Research Council. Part of this work was done while the author was visiting Bell Laboratories of Lucent Technologies.

References

  1. A.V. Aho, J.E. Hopcroft, and J.D. Ullman. Data Structures and Algorithms. Addison-Wesley, Reading, MA., 1983.

  2. A. Apostolico. String editing and longest common subsequence. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, Vol. 2, pages 361–398, Berlin, 1997. Springer Verlag.

  3. B. S. Baker. A theory of parameterized pattern matching: Algorithms and applications. In Proc. 25th Symposium on Theory of Computing, pages 71–80. ACM, 1993.

  4. D. Eppstein, Z. Galil, R. Giancarlo, and G. Italiano. Sparse dynamic programming I: Linear cost functions. J. of ACM, 39:519–545, 1992.

  5. D. Eppstein, Z. Galil, R. Giancarlo, and G. Italiano. Sparse dynamic programming II: Convex and concave cost functions. J. of ACM, 39:546–567, 1992.

  6. M. Farach and M. Thorup. Optimal evolutionary tree comparison by sparse dynamic programming. In Proc. 35th Symposium on Foundations of Computer Science, pages 770–779. IEEE, 1994.

  7. D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge, 1997.

  8. D.S. Hirschberg. Serial computations of Levenshtein distances. In A. Apostolico and Z. Galil, editors, Pattern Matching Algorithms, pages 123–142, Oxford, 1997. Oxford University Press.

  9. J.W. Hunt and T.G. Szymanski. A fast algorithm for computing longest common subsequences. Comm. of the ACM, 20:350–353, 1977.

  10. D. B. Johnson. A priority queue in which initialization and queue operations take O(log log D) time. Math. Sys. Th., 15:295–309, 1982.

  11. J.B. Kruskal and D. Sankoff, editors. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, 1983.

  12. W. Miller and E. Myers. Chaining multiple alignment fragments in sub-quadratic time. In Proc. of 6-th ACM-SIAM SODA, pages 48–57, 1995.

  13. E. W. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1:251–266, 1986.

  14. P. van Emde Boas. Preserving order in a forest in less than logarithmic time. Info. Proc. Lett., 6:80–82, 1977.

  15. M.S. Waterman. Introduction to Computational Biology: Maps, Sequences and Genomes. Chapman & Hall, Los Angeles, 1995.

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baker, B.S., Giancarlo, R. (1998). Longest Common Subsequence from Fragments via Sparse Dynamic Programming. In: Bilardi, G., Italiano, G.F., Pietracaprina, A., Pucci, G. (eds) Algorithms — ESA’ 98. ESA 1998. Lecture Notes in Computer Science, vol 1461. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-68530-8_7

  • DOI: https://doi.org/10.1007/3-540-68530-8_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64848-2

  • Online ISBN: 978-3-540-68530-2

  • eBook Packages: Springer Book Archive
