Efficient Computation of Palindromes in Sequences with Uncertainties

  • Mai Alzamel
  • Jia Gao
  • Costas S. Iliopoulos
  • Chang Liu
  • Solon P. Pissis
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 744)

Abstract

In this work, we consider a special type of uncertain sequence called a weighted string. In a weighted string, every position contains a subset of the alphabet, and every letter of that subset is associated with a probability of occurrence such that the probabilities at each position sum to 1. Usually a cumulative weight threshold \(1/z\) is specified, and one considers only strings that match the weighted string with probability at least \(1/z\). We provide an \(\mathcal {O}(nz)\)-time and \(\mathcal {O}(nz)\)-space off-line algorithm, where n is the length of the weighted string and \(1/z\) is the given threshold, to compute a smallest maximal palindromic factorization of a weighted string. This factorization has applications in hairpin structure prediction in a set of closely-related DNA or RNA sequences. Along the way, we provide an \(\mathcal {O}(nz)\)-time and \(\mathcal {O}(nz)\)-space off-line algorithm to compute maximal palindromes in weighted strings.
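The definitions in the abstract can be illustrated with a minimal Python sketch. It is a brute-force toy, not the \(\mathcal {O}(nz)\) algorithm of the paper. The example weighted string `X`, the function names, and the reading of the threshold as \(1/z\) for an integer `z` are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy weighted string: one {letter: probability} map per position.
# The probabilities at each position sum to 1, as the definition requires.
X = [
    {"a": Fraction(1)},
    {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    {"a": Fraction(1)},
]

def occurrence_probability(weighted, start, s):
    """Probability that the plain string s matches weighted[start:start+len(s)]."""
    p = Fraction(1)
    for offset, letter in enumerate(s):
        # A letter absent from a position has probability 0 there.
        p *= weighted[start + offset].get(letter, Fraction(0))
    return p

def is_valid(weighted, start, s, z):
    """True if s matches at `start` with probability at least 1/z (assumed threshold form)."""
    return occurrence_probability(weighted, start, s) >= Fraction(1, z)

def valid_palindromes(weighted, start, length, z):
    """Brute force: every palindromic string that validly matches the given window.

    Enumerates all letter combinations, so it is exponential in `length`;
    it only illustrates the definitions.
    """
    window = weighted[start:start + length]
    found = []
    for letters in product(*(sorted(position) for position in window)):
        s = "".join(letters)
        if s == s[::-1] and is_valid(weighted, start, s, z):
            found.append(s)
    return found
```

For the toy input, `valid_palindromes(X, 0, 3, 2)` considers the candidates `aaa` and `aba`, each of which matches with probability 1/2. Exact `Fraction` arithmetic is used so that the comparison against the threshold is not affected by floating-point rounding.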

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Mai Alzamel (1), email author
  • Jia Gao (1)
  • Costas S. Iliopoulos (1)
  • Chang Liu (1)
  • Solon P. Pissis (1)
  1. Department of Informatics, King’s College London, London, UK
