Longest Property-Preserved Common Factor

  • Lorraine A. K. Ayad
  • Giulia Bernardini
  • Roberto Grossi
  • Costas S. Iliopoulos
  • Nadia Pisanti
  • Solon P. Pissis
  • Giovanna Rosone
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11147)

Abstract

In this paper we introduce a new family of string processing problems. We are given two or more strings and we are asked to compute a factor common to all strings that preserves a specific property and has maximal length. Here we consider two fundamental string properties: square-free factors and periodic factors under two different settings, one per property. In the first setting, we are given a string x and we are asked to construct a data structure over x answering the following type of on-line queries: given string y, find a longest square-free factor common to x and y. In the second setting, we are given k strings and an integer \(1 < k'\le k\) and we are asked to find a longest periodic factor common to at least \(k'\) strings. We present linear-time solutions for both settings. We anticipate that our paradigm can be extended to other string properties.
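The two problems stated above can be illustrated with a brute-force sketch. It is not the linear-time method the paper presents: it enumerates all factors and is meant only to make the problem definitions concrete. The function names are hypothetical, and the definition of "periodic" used here (smallest period p with 2p ≤ length) is an assumption, since the abstract does not spell it out.

```python
def is_square_free(s: str) -> bool:
    """True if s has no factor of the form uu with u non-empty."""
    n = len(s)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if s[i:i + half] == s[i + half:i + 2 * half]:
                return False
    return True


def smallest_period(s: str) -> int:
    """Smallest period of s, computed as len(s) minus its longest proper border."""
    n = len(s)
    border = [0] * (n + 1)  # border[j] = longest proper border of s[:j]
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = border[k]
        if s[i] == s[k]:
            k += 1
        border[i + 1] = k
    return n - border[n]


def is_periodic(s: str) -> bool:
    """Assumed definition: s is periodic if its smallest period p has 2p <= |s|."""
    return len(s) > 0 and 2 * smallest_period(s) <= len(s)


def common_factors(strings, min_count):
    """Non-empty factors occurring in at least min_count of the given strings."""
    counts = {}
    for s in strings:
        factors = {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
        for f in factors:
            counts[f] = counts.get(f, 0) + 1
    return {f for f, c in counts.items() if c >= min_count}


def longest_square_free_common_factor(x: str, y: str) -> str:
    """First setting: a longest square-free factor common to x and y (brute force)."""
    candidates = [f for f in common_factors([x, y], 2) if is_square_free(f)]
    return max(candidates, key=len, default="")


def longest_periodic_common_factor(strings, k_prime: int) -> str:
    """Second setting: a longest periodic factor common to at least k_prime strings."""
    candidates = [f for f in common_factors(strings, k_prime) if is_periodic(f)]
    return max(candidates, key=len, default="")
```

For instance, with x = "aabb" and y = "aab" the longest common factor is "aab", but it contains the square "aa", so the sketch returns "ab" instead. When several answers of the same length exist, which one is returned is unspecified.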

Keywords

Longest common factor · Periodicity · Squares · Algorithms

Notes

Acknowledgements

Solon P. Pissis and Giovanna Rosone are partially supported by the Royal Society project IE 161274 “Processing uncertain sequences: combinatorics and applications”. Giovanna Rosone and Nadia Pisanti are partially supported by the Italian MIUR-SIR project CMACBioSeq (“Combinatorial methods for analysis and compression of biological sequences”), grant n. RBSI146R5L.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Lorraine A. K. Ayad (1)
  • Giulia Bernardini (2)
  • Roberto Grossi (3)
  • Costas S. Iliopoulos (1)
  • Nadia Pisanti (3, 4), corresponding author
  • Solon P. Pissis (1)
  • Giovanna Rosone (3)

  1. Department of Informatics, King’s College London, London, UK
  2. Department of Informatics, Systems and Communication, University of Milan-Bicocca, Milan, Italy
  3. Department of Computer Science, University of Pisa, Pisa, Italy
  4. ERABLE Team, INRIA, Lyon, France
