Polynomial-Time Approximation Algorithms for Weighted LCS Problem

  • Conference paper
Combinatorial Pattern Matching (CPM 2011)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6661)

Abstract

We deal with a variant of the well-known Longest Common Subsequence (LCS) problem for weighted sequences. A (biological) weighted sequence determines the probability for each symbol to occur at a given position of the sequence (such sequences are also called Position Weighted Matrices, PWM). Two versions of the problem were proposed by Amir et al. (2009 and 2010); they are called LCWS and LCWS2 (Longest Common Weighted Subsequence 1 and 2 Problem). We solve an open problem, stated in the conclusions of the paper by Amir et al., concerning the tractability of a log-probability version of the LCWS2 problem for bounded alphabets, showing that it is NP-hard already for an alphabet of size 2. We also improve the (1/|Σ|)-approximation algorithm given by Amir et al. (where Σ is the alphabet): we show a polynomial-time approximation scheme (PTAS) for the LCWS2 problem using O(n^5) space. We also give a simpler (1/2)-approximation algorithm for the same problem using only O(n^2) space.
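
To make the notion of a weighted sequence concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the authors' algorithm: the toy data, the dictionary-per-position representation, and the function names `occurrence_probability` and `occurrence_log_probability` are all hypothetical. It assumes that positions are independent, so the probability of spelling a word at a chosen increasing list of positions is the product of the per-position probabilities, and that the log-probability variant works with the sum of logarithms instead.

```python
import math

# Hypothetical toy weighted sequence over the alphabet {"a", "b"}:
# one dict per position, mapping each symbol to its probability there.
weighted_seq = [
    {"a": 0.5, "b": 0.5},
    {"a": 1.0, "b": 0.0},
    {"a": 0.25, "b": 0.75},
]


def occurrence_probability(seq, positions, word):
    """Probability that `word` is spelled at the given increasing positions.

    Assumes independence between positions, so the result is the product
    of the per-position symbol probabilities.
    """
    if len(positions) != len(word):
        raise ValueError("need exactly one position per symbol")
    if any(p >= q for p, q in zip(positions, positions[1:])):
        raise ValueError("positions must be strictly increasing")
    prob = 1.0
    for pos, symbol in zip(positions, word):
        prob *= seq[pos].get(symbol, 0.0)
    return prob


def occurrence_log_probability(seq, positions, word):
    """Natural log of the occurrence probability; -inf if it is zero."""
    prob = occurrence_probability(seq, positions, word)
    return math.log(prob) if prob > 0.0 else -math.inf


if __name__ == "__main__":
    # "ab" read at positions 0 and 2: 0.5 * 0.75
    print(occurrence_probability(weighted_seq, [0, 2], "ab"))
    print(occurrence_log_probability(weighted_seq, [0, 2], "ab"))
```

A weighted-LCS variant would then search for a longest word whose occurrence probabilities in two such sequences meet a given threshold; the precise threshold conditions of LCWS and LCWS2 are defined in the cited papers by Amir et al. and are not reproduced here.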

The first author is supported by grant no. N206 355636 of the Polish Ministry of Science and Higher Education. The third author is supported by grant no. N206 568540 of the National Science Centre. The fourth author is supported by grant no. N206 566740 of the National Science Centre.

References

  1. Amir, A., Chencinski, E., Iliopoulos, C.S., Kopelowitz, T., Zhang, H.: Property matching and weighted matching. Theor. Comput. Sci. 395(2-3), 298–310 (2008)

  2. Amir, A., Gotthilf, Z., Shalom, B.R.: Weighted LCS. J. Discrete Algorithms 8, 273–281 (2010)

  3. Amir, A., Iliopoulos, C.S., Kapah, O., Porat, E.: Approximate matching in weighted sequences. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 365–376. Springer, Heidelberg (2006)

  4. Antoniou, P., Iliopoulos, C.S., Mouchard, L., Pissis, S.P.: Algorithms for mapping short degenerate and weighted sequences to a reference genome. I. J. Computational Biology and Drug Design 2(4), 385–397 (2009)

  5. Bergroth, L., Hakonen, H., Raita, T.: A survey of longest common subsequence algorithms. In: SPIRE, pp. 39–48 (2000)

  6. Christodoulakis, M., Iliopoulos, C.S., Mouchard, L., Perdikuri, K., Tsakalidis, A.K., Tsichlas, K.: Computation of repetitions and regularities of biologically weighted sequences. Journal of Computational Biology 13(6), 1214–1231 (2006)

  7. Crochemore, M.: An optimal algorithm for computing the repetitions in a word. Inf. Process. Lett. 12(5), 244–250 (1981)

  8. Crochemore, M., Rytter, W.: Jewels of Stringology. World Scientific, Singapore (2003)

  9. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York (1979)

  10. Gusfield, D.: Algorithms on Strings, Trees, and Sequences – Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)

  11. Iliopoulos, C.S., Makris, C., Panagis, Y., Perdikuri, K., Theodoridis, E., Tsakalidis, A.K.: The weighted suffix tree: An efficient data structure for handling molecular weighted sequences and its applications. Fundam. Inform. 71(2-3), 259–277 (2006)

  12. Iliopoulos, C.S., Miller, M., Pissis, S.P.: Parallel algorithms for degenerate and weighted sequences derived from high throughput sequencing technologies. In: Holub, J., Zdárek, J. (eds.) Stringology, pp. 249–262. Prague Stringology Club, Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague (2009)

  13. Iliopoulos, C.S., Mouchard, L., Perdikuri, K., Tsakalidis, A.K.: Computing the repetitions in a biological weighted sequence. Journal of Automata, Languages and Combinatorics 10(5/6), 687–696 (2005)

  14. Iliopoulos, C.S., Perdikuri, K., Theodoridis, E., Tsakalidis, A., Tsichlas, K.: Motif extraction from weighted sequences. In: Apostolico, A., Melucci, M. (eds.) SPIRE 2004. LNCS, vol. 3246, pp. 286–297. Springer, Heidelberg (2004)

  15. Myers, E.W., Celera Genomics Corporation: A whole-genome assembly of Drosophila. Science 287(5461), 2196–2204 (2000)

  16. Thompson, J.D., Higgins, D.G., Gibson, T.J.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)

  17. Venter, J.C., Celera Genomics Corporation: The sequence of the human genome. Science 291, 1304–1351 (2001)

  18. Zhang, H., Guo, Q., Fan, J., Iliopoulos, C.S.: Loose and strict repeats in weighted sequences of proteins. Protein and Peptide Letters 17(9), 1136–1142 (2010)

  19. Zhang, H., Guo, Q., Iliopoulos, C.S.: String matching with swaps in a weighted sequence. In: Zhang, J., He, J.-H., Fu, Y. (eds.) CIS 2004. LNCS, vol. 3314, pp. 698–704. Springer, Heidelberg (2004)

  20. Zhang, H., Guo, Q., Iliopoulos, C.S.: An algorithmic framework for motif discovery problems in weighted sequences. In: Calamoneri, T., Diaz, J. (eds.) CIAC 2010. LNCS, vol. 6078, pp. 335–346. Springer, Heidelberg (2010)

  21. Zhang, H., Guo, Q., Iliopoulos, C.S.: Varieties of regularities in weighted sequences. In: Chen, B. (ed.) AAIM 2010. LNCS, vol. 6124, pp. 271–280. Springer, Heidelberg (2010)


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cygan, M., Kubica, M., Radoszewski, J., Rytter, W., Waleń, T. (2011). Polynomial-Time Approximation Algorithms for Weighted LCS Problem. In: Giancarlo, R., Manzini, G. (eds) Combinatorial Pattern Matching. CPM 2011. Lecture Notes in Computer Science, vol 6661. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21458-5_38

  • DOI: https://doi.org/10.1007/978-3-642-21458-5_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21457-8

  • Online ISBN: 978-3-642-21458-5

  • eBook Packages: Computer Science, Computer Science (R0)
