Weighted Shortest Common Supersequence

  • Amihood Amir
  • Zvi Gotthilf
  • B. Riva Shalom
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7024)

Abstract

The Shortest Common Supersequence (SCS) problem asks for a shortest possible sequence that contains each of the input sequences as a subsequence. In this paper we consider applying the problem to Position Weight Matrices (PWM). The Position Weight Matrix was introduced as a tool to handle a set of sequences that are not identical yet have many local similarities. Such a weighted sequence is a ‘statistical image’ of this set, in which we are given the probability of every symbol’s occurrence at every text location. We consider two possible definitions of SCS on PWM. For the first, we give a polynomial-time algorithm for the case of two input sequences. For the second, we prove \(\mathcal{NP}\)-hardness.
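
For orientation, the classical (unweighted) SCS of two ordinary strings can be computed with a textbook dynamic program. The sketch below is an illustration of that standard problem only. It is not the weighted algorithm of the paper, whose PWM-based definitions are not reproduced in this abstract. The function names `scs` and `is_subsequence` are hypothetical.

```python
def scs(a: str, b: str) -> str:
    """Return one shortest common supersequence of two plain strings.

    Classical unweighted case only; an illustrative sketch, not the
    paper's weighted (PWM) algorithm.
    """
    n, m = len(a), len(b)
    # dp[i][j] = length of a shortest common supersequence of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                dp[i][j] = m - j          # only b[j:] remains
            elif j == m:
                dp[i][j] = n - i          # only a[i:] remains
            elif a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]   # one symbol serves both
            else:
                dp[i][j] = 1 + min(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from (0, 0) to rebuild one optimal supersequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] <= dp[i][j + 1]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return "".join(out)


def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting zero or more symbols."""
    it = iter(t)
    return all(ch in it for ch in s)


result = scs("abac", "cab")
print(result, len(result))
```

The table has (n+1)(m+1) entries, so this runs in O(nm) time and space for inputs of lengths n and m. The result always has length n + m minus the length of a longest common subsequence of the two inputs.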

Keywords

  • Input Sequence
  • Partition Problem
  • Input String
  • Position Weight Matrix
  • Text Location
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Amihood Amir (1, 2)
  • Zvi Gotthilf (1)
  • B. Riva Shalom (3)

  1. Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
  2. Department of Computer Science, Johns Hopkins University, Baltimore, USA
  3. Department of Software Engineering, Shenkar College, Ramat-Gan, Israel
