Analysis of Maximal Repetitions in Strings

  • Maxime Crochemore
  • Lucian Ilie
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4708)

Abstract

The cornerstone of any algorithm computing all repetitions in strings of length n in O(n) time is the fact that the number of maximal repetitions (runs) is linear. Therefore, the most important part of the analysis of the running time of such algorithms is counting the number of runs. Kolpakov and Kucherov [FOCS’99] proved it to be at most cn but could not provide any value for c. Recently, Rytter [STACS’06] proved that c ≤ 5. His analysis has been improved by Puglisi et al. to obtain 3.48 and by Rytter to 3.44 (both submitted). The conjecture of Kolpakov and Kucherov, supported by computations, is that c = 1. Here we dramatically improve the previous results by proving that c ≤ 1.6, and we show how this bound could be improved by computer verification down to 1.18 or less. While the conjecture may be very difficult to prove, we believe that our work provides a good approximation for all practical purposes.

For the stronger result concerning the linearity of the sum of exponents, we give the first explicit bound: 5.6n. Kolpakov and Kucherov did not give one, and Rytter considered the bound that could be deduced from his proof “unsatisfactory”. Our bound could likewise be improved by computer verification, down to 2.9n or less.
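
To make the two quantities in the abstract concrete, here is a minimal brute-force sketch in Python. It assumes the standard definitions, which the abstract does not spell out: a run is a substring that has smallest period p, has length at least 2p, and cannot be extended to the left or right without losing that period; its exponent is its length divided by p. The function names `runs` and `sum_of_exponents` are illustrative only. The sketch runs in quadratic time and is not the linear-time algorithm of Kolpakov and Kucherov, nor the counting argument of the paper.

```python
from fractions import Fraction


def runs(w):
    """Return the runs of w as {(start, end): smallest_period}, end inclusive.

    Naive O(n^2) enumeration: for every candidate period p, scan for maximal
    stretches of positions k with w[k] == w[k + p].
    """
    n = len(w)
    found = {}
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if w[k] != w[k + p]:
                k += 1
                continue
            a = k  # first position of a maximal stretch of matches
            while k + p < n and w[k] == w[k + p]:
                k += 1
            # w[a .. k-1+p] has period p and cannot be extended on either side.
            start, end = a, k - 1 + p
            if end - start + 1 >= 2 * p:
                # Periods are tried in increasing order, so the first period
                # recorded for an interval is its smallest one.
                found.setdefault((start, end), p)
    return found


def sum_of_exponents(w):
    """Sum of length / period over all runs of w, as an exact fraction."""
    return sum(Fraction(end - start + 1, p) for (start, end), p in runs(w).items())


if __name__ == "__main__":
    w = "mississippi"
    for (start, end), p in sorted(runs(w).items()):
        print(start, end, p, w[start:end + 1])
    print(len(runs(w)), sum_of_exponents(w))
```

For `"mississippi"` (n = 11) this finds four runs: `ss` twice, `pp`, and `ississi` with period 3, giving a sum of exponents of 2 + 2 + 2 + 7/3 = 25/3. Both values sit well under the bounds 1.6n and 5.6n stated above.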

Keywords

Combinatorics on words; repetitions in strings; runs; maximal repetitions; maximal periodicities; sum of exponents

References

  1. Apostolico, A., Preparata, F.: Optimal off-line detection of repetitions in a string. Theoret. Comput. Sci. 22(3), 297–315 (1983)
  2. Crochemore, M.: An optimal algorithm for computing the repetitions in a word. Inform. Proc. Letters 12, 244–250 (1981)
  3. Crochemore, M., Ilie, L.: A simple proof that the number of runs in a word is linear (manuscript, 2006)
  4. Crochemore, M., Rytter, W.: Squares, cubes, and time-space efficient string searching. Algorithmica 13, 405–425 (1995)
  5. Fraenkel, A.S., Simpson, J.: How many squares can a string contain? J. Combin. Theory, Ser. A 82, 112–120 (1998)
  6. Franek, F., Simpson, R.J., Smyth, W.F.: The maximum number of runs in a string. In: Miller, M., Park, K. (eds.) Proc. 14th Australasian Workshop on Combinatorial Algorithms, pp. 26–35 (2003)
  7. Ilie, L.: A simple proof that a word of length n has at most 2n distinct squares. J. Combin. Theory, Ser. A 112(1), 163–164 (2005)
  8. Ilie, L.: A note on the number of squares in a word. Theoret. Comput. Sci. 380(3), 373–376 (2007)
  9. Iliopoulos, C.S., Moore, D., Smyth, W.F.: A characterization of the squares in a Fibonacci string. Theoret. Comput. Sci. 172, 281–291 (1997)
  10. Kolpakov, R., Kucherov, G.: On the sum of exponents of maximal repetitions in a word. Tech. Report 99-R-034, LORIA (1999)
  11. Kolpakov, R., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: Proc. of FOCS 1999, pp. 596–604. IEEE Computer Society Press, Los Alamitos (1999)
  12. Kolpakov, R., Kucherov, G.: On maximal repetitions in words. J. Discrete Algorithms 1(1), 159–186 (2000)
  13. Lothaire, M.: Algebraic Combinatorics on Words. Cambridge Univ. Press, Cambridge (2002)
  14. Lothaire, M.: Applied Combinatorics on Words. Cambridge Univ. Press, Cambridge (2005)
  15. Main, M.G.: Detecting leftmost maximal periodicities. Discrete Applied Math. 25, 145–153 (1989)
  16. Puglisi, S.J., Simpson, J., Smyth, B.: How many runs can a string contain? (submitted, 2006)
  17. Rytter, W.: The number of runs in a string: improved analysis of the linear upper bound. In: Durand, B., Thomas, W. (eds.) STACS 2006. LNCS, vol. 3884, pp. 184–195. Springer, Heidelberg (2006)
  18. Rytter, W.: The number of runs in a string (submitted, 2006)
  19. Stoye, J., Gusfield, D.: Simple and flexible detection of contiguous repeats using a suffix tree. In: Farach-Colton, M. (ed.) CPM 1998. LNCS, vol. 1448, pp. 140–152. Springer, Heidelberg (1998)
  20. Thue, A.: Über unendliche Zeichenreihen. Kra. Vidensk. Selsk. Skrifter. I. Mat.-Nat. Kl. Cristiana 7 (1906)

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Maxime Crochemore (1)
  • Lucian Ilie (2)
  1. Institut Gaspard-Monge, Université de Marne-la-Vallée, 77454 Marne-la-Vallée, Cedex 2, France
  2. Department of Computer Science, University of Western Ontario, N6A 5B7, London, Ontario, Canada
