Inferring Strings from Lyndon Factorization

  • Yuto Nakashima
  • Takashi Okabe
  • Tomohiro I
  • Shunsuke Inenaga
  • Hideo Bannai
  • Masayuki Takeda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8635)

Abstract

The Lyndon factorization of a string w is the unique factorization \(\ell_1^{p_1}, \ldots, \ell_m^{p_m}\) of w such that \(\ell_1, \ldots, \ell_m\) is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: given a sequence \(S = ((s_1, p_1), \ldots, (s_m, p_m))\) of ordered pairs of positive integers, find a string w whose Lyndon factorization corresponds to the input sequence S, i.e., the Lyndon factorization of w is of the form \(\ell_1^{p_1}, \ldots, \ell_m^{p_m}\) with \(|\ell_i| = s_i\) for all \(1 \le i \le m\). First, we show that there exists a simple O(n)-time algorithm if the size of the alphabet is unbounded, where n is the length of the output string. Second, we present an O(n)-time algorithm to compute a string over an alphabet of the smallest size. Third, we show how to compute only the size of the smallest alphabet in O(m) time. Fourth, we give an O(m)-time algorithm to compute an O(m)-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to S.
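To make the problem statement concrete, the following Python sketch computes a Lyndon factorization with Duval's classical algorithm and collapses it into the (length, power) pairs that form the sequence S. The function names are illustrative and do not come from the paper. The helper `naive_string` is a hypothetical construction for the unbounded-alphabet case only: it builds each factor as one leading letter followed by copies of a single larger letter, so the factors are Lyndon words with strictly decreasing leading letters. It is an assumption made for illustration, not necessarily the algorithm the paper presents, and it makes no attempt to minimize the alphabet size.

```python
from itertools import groupby


def lyndon_factorization(w):
    """Duval's algorithm: Lyndon factors of w in order, repeats included."""
    factors = []
    i, n = 0, len(w)
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a prefix of a power of a Lyndon word.
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        # The run consists of copies of a Lyndon word of length j - k.
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors


def factorization_signature(w):
    """Collapse equal adjacent factors into (length, power) pairs."""
    return [(len(f), sum(1 for _ in run))
            for f, run in groupby(lyndon_factorization(w))]


def naive_string(S):
    """Hypothetical construction over a large integer alphabet (illustration only).

    Factor i is [lead_i, top, ..., top] with lead_1 > lead_2 > ... > lead_m
    and top larger than every lead, so each factor is a Lyndon word and the
    factors are strictly decreasing.
    """
    m = len(S)
    top = m + 1
    w = []
    for i, (s, p) in enumerate(S):
        lead = m - i
        w.extend(([lead] + [top] * (s - 1)) * p)
    return w


if __name__ == "__main__":
    S = [(3, 2), (1, 4), (2, 1)]
    w = naive_string(S)
    print(w)
    print(factorization_signature(w) == S)
```

For instance, `factorization_signature("banana")` yields `[(1, 1), (2, 2), (1, 1)]`, since the Lyndon factorization of "banana" is b, an, an, a. The naive construction uses up to m + 1 distinct letters, which is what the smallest-alphabet results in the abstract are designed to avoid.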

Keywords

Time Algorithm; Input Sequence; Compact Representation; String Match; Large Position
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Yuto Nakashima (1)
  • Takashi Okabe (1)
  • Tomohiro I (2)
  • Shunsuke Inenaga (1)
  • Hideo Bannai (1)
  • Masayuki Takeda (1)

  1. Department of Informatics, Kyushu University, Japan
  2. Department of Computer Science, TU Dortmund, Germany
