Analysis of Maximal Repetitions in Strings

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4708)

Abstract

The cornerstone of any algorithm computing all repetitions in a string of length n in \({\mathcal O}(n)\) time is the fact that the number of maximal repetitions (runs) is linear. Therefore, the most important part of the running-time analysis of such algorithms is counting the number of runs. Kolpakov and Kucherov [FOCS’99] proved it to be at most cn but could not provide any value for c. Recently, Rytter [STACS’06] proved that c ≤ 5. His analysis has been improved by Puglisi et al. to obtain 3.48 and by Rytter to 3.44 (both submitted). The conjecture of Kolpakov and Kucherov, supported by computations, is that c = 1. Here we dramatically improve the previous results by proving that c ≤ 1.6, and we show how this bound could be improved by computer verification down to 1.18 or less. While the conjecture may be very difficult to prove, we believe that our work provides a good approximation for all practical purposes.

For the stronger result concerning the linearity of the sum of exponents, we give the first explicit bound: 5.6n. Kolpakov and Kucherov did not have any, and Rytter considered “unsatisfactory” the bound that could be deduced from his proof. Our bound could likewise be improved by computer verification down to 2.9n or less.
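To make the two quantities bounded above concrete, the following is a minimal brute-force sketch, not the linear-time method of Kolpakov and Kucherov and not taken from the paper. It assumes the standard definition of a run: a factor w[start:end] whose smallest period p satisfies end - start ≥ 2p and whose periodicity cannot be extended to the left or right. The exponent of a run is taken to be its length divided by p. The function names `smallest_period`, `runs`, and `sum_of_exponents` are illustrative choices.

```python
from __future__ import annotations

from fractions import Fraction


def smallest_period(s: str) -> int:
    """Smallest p >= 1 such that s[k] == s[k + p] for every valid k."""
    n = len(s)
    for p in range(1, n + 1):
        if all(s[k] == s[k + p] for k in range(n - p)):
            return p
    return 0  # only reached for the empty string


def runs(w: str) -> list[tuple[int, int, int]]:
    """Enumerate runs as (start, end_exclusive, period) by brute force."""
    n = len(w)
    found = []
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if w[k] != w[k + p]:
                k += 1
                continue
            start = k
            # Extend while each letter equals the one p positions later.
            while k + p < n and w[k] == w[k + p]:
                k += 1
            end = k + p  # exclusive end of the factor with period p
            # Keep it only if it is long enough and p is its smallest period;
            # otherwise it is too short or is a run already found for a smaller p.
            if end - start >= 2 * p and smallest_period(w[start:end]) == p:
                found.append((start, end, p))
    return found


def sum_of_exponents(w: str) -> Fraction:
    """Sum of (length / period) over all runs of w, as an exact fraction."""
    return sum((Fraction(end - start, p) for start, end, p in runs(w)),
               Fraction(0))


if __name__ == "__main__":
    w = "mississippi"
    n = len(w)
    print(runs(w))
    print(len(runs(w)), "runs; bound 1.6n =", 1.6 * n)
    print(float(sum_of_exponents(w)), "sum of exponents; bound 5.6n =", 5.6 * n)
```

The sketch takes roughly cubic time in the worst case, so it is only suitable for checking small strings against the bounds 1.6n and 5.6n stated in the abstract.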

This work has been done during the second author’s stay at Institut Gaspard-Monge.


References

  1. Apostolico, A., Preparata, F.: Optimal off-line detection of repetitions in a string. Theoret. Comput. Sci. 22(3), 297–315 (1983)

  2. Crochemore, M.: An optimal algorithm for computing the repetitions in a word. Inform. Proc. Letters 12, 244–250 (1981)

  3. Crochemore, M., Ilie, L.: A simple proof that the number of runs in a word is linear (manuscript, 2006)

  4. Crochemore, M., Rytter, W.: Squares, cubes, and time-space efficient string searching. Algorithmica 13, 405–425 (1995)

  5. Fraenkel, A.S., Simpson, J.: How many squares can a string contain? J. Combin. Theory, Ser. A 82, 112–120 (1998)

  6. Franek, F., Simpson, R.J., Smyth, W.F.: The maximum number of runs in a string. In: Miller, M., Park, K. (eds.) Proc. 14th Australasian Workshop on Combinatorial Algorithms, pp. 26–35 (2003)

  7. Ilie, L.: A simple proof that a word of length n has at most 2n distinct squares. J. Combin. Theory, Ser. A 112(1), 163–164 (2005)

  8. Ilie, L.: A note on the number of squares in a word. Theoret. Comput. Sci. 380(3), 373–376 (2007)

  9. Iliopoulos, C.S., Moore, D., Smyth, W.F.: A characterization of the squares in a Fibonacci string. Theoret. Comput. Sci. 172, 281–291 (1997)

  10. Kolpakov, R., Kucherov, G.: On the sum of exponents of maximal repetitions in a word. Tech. Report 99-R-034, LORIA (1999)

  11. Kolpakov, R., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: Proc. of FOCS 1999, pp. 596–604. IEEE Computer Society Press, Los Alamitos (1999)

  12. Kolpakov, R., Kucherov, G.: On maximal repetitions in words. J. Discrete Algorithms 1(1), 159–186 (2000)

  13. Lothaire, M.: Algebraic Combinatorics on Words. Cambridge Univ. Press, Cambridge (2002)

  14. Lothaire, M.: Applied Combinatorics on Words. Cambridge Univ. Press, Cambridge (2005)

  15. Main, M.G.: Detecting leftmost maximal periodicities. Discrete Applied Math. 25, 145–153 (1989)

  16. Puglisi, S.J., Simpson, J., Smyth, B.: How many runs can a string contain? (submitted, 2006)

  17. Rytter, W.: The number of runs in a string: improved analysis of the linear upper bound. In: Durand, B., Thomas, W. (eds.) STACS 2006. LNCS, vol. 3884, pp. 184–195. Springer, Heidelberg (2006)

  18. Rytter, W.: The number of runs in a string (submitted, 2006)

  19. Stoye, J., Gusfield, D.: Simple and flexible detection of contiguous repeats using a suffix tree. In: Farach-Colton, M. (ed.) CPM 1998. LNCS, vol. 1448, pp. 140–152. Springer, Heidelberg (1998)

  20. Thue, A.: Über unendliche Zeichenreihen. Kra. Vidensk. Selsk. Skrifter. I. Mat.-Nat. Kl. Cristiana 7 (1906)

Editor information

Luděk Kučera Antonín Kučera

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crochemore, M., Ilie, L. (2007). Analysis of Maximal Repetitions in Strings. In: Kučera, L., Kučera, A. (eds) Mathematical Foundations of Computer Science 2007. MFCS 2007. Lecture Notes in Computer Science, vol 4708. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74456-6_42

  • DOI: https://doi.org/10.1007/978-3-540-74456-6_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74455-9

  • Online ISBN: 978-3-540-74456-6

  • eBook Packages: Computer Science, Computer Science (R0)
