Searching of Gapped Repeats and Subrepetitions in a Word

  • Conference paper
Combinatorial Pattern Matching (CPM 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8486)

Abstract

A gapped repeat is a factor of the form uvu, where u and v are nonempty words. The period of the gapped repeat is defined as |u| + |v|. The gapped repeat is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its period. The gapped repeat is called α-gapped if its period is not greater than α|u|. A δ-subrepetition is a factor whose exponent is less than 2 but not less than 1 + δ (the exponent of a factor is the quotient of its length and its minimal period). A δ-subrepetition is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its minimal period. We show that in a word of length n the number of maximal α-gapped repeats is bounded by \(O(\alpha^2 n)\) and the number of maximal δ-subrepetitions is bounded by \(O(n/\delta^2)\). Using these upper bounds, we propose algorithms for finding all maximal α-gapped repeats and all maximal δ-subrepetitions in a word of length n. The algorithm for finding all maximal α-gapped repeats has \(O(\alpha^2 n)\) time complexity for the case of constant alphabet size and \(O(n\log n + \alpha^2 n)\) time complexity for the general case. For finding all maximal δ-subrepetitions we propose two algorithms. The first algorithm has \(O(\frac{n\log\log n}{\delta^2})\) time complexity for the case of constant alphabet size and \(O(n\log n +\frac{n\log\log n}{\delta^2})\) time complexity for the general case. The second algorithm has \(O(n\log n+\frac{n}{\delta^2}\log \frac{1}{\delta})\) expected time complexity.
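
To make the definitions above concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm proposed in the paper and runs in quadratic time or worse. The function names (`maximal_gapped_repeats`, `minimal_period`, `is_delta_subrepetition`) and the `(start, arm_length, period)` output format are assumptions made for illustration. The sketch also assumes that a maximal gapped repeat with period p corresponds to a maximal run of positions i with w[i] = w[i + p] whose length is smaller than p, so that the gap v is nonempty.

```python
from fractions import Fraction


def maximal_gapped_repeats(w, alpha):
    """Brute-force list of maximal alpha-gapped repeats of w.

    Each result is (start, arm_length, period): the left copy of u is
    w[start:start + arm_length] and the right copy begins `period`
    letters later. Illustrative only, O(n^2) time.
    """
    n = len(w)
    found = []
    for p in range(2, n):  # period = |u| + |v| >= 2
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            # Extend the run of matches w[j] == w[j + p] as far as possible.
            j = i
            while j + p < n and w[j] == w[j + p]:
                j += 1
            arm = j - i
            # arm < p keeps the gap v nonempty; p <= alpha * arm is the
            # alpha-gapped condition from the abstract.
            if arm < p and p <= alpha * arm:
                found.append((i, arm, p))
            i = j
    return found


def minimal_period(x):
    """Smallest p >= 1 such that x[i] == x[i + p] for all valid i."""
    n = len(x)
    for p in range(1, n + 1):
        if all(x[i] == x[i + p] for i in range(n - p)):
            return p


def is_delta_subrepetition(x, delta):
    """True if 1 + delta <= exponent(x) < 2 (maximality is not checked)."""
    exponent = Fraction(len(x), minimal_period(x))
    return 1 + delta <= exponent < 2
```

For example, in the word `abcab` the factor `ab·c·ab` has period 3 and arm length 2, so `maximal_gapped_repeats("abcab", 3)` returns `[(0, 2, 3)]`, while with `alpha = 1` nothing is reported because 3 > 1 · 2. The same word has minimal period 3 and exponent 5/3, so `is_delta_subrepetition("abcab", 0.5)` is true.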

This work is partially supported by the Russian Foundation for Fundamental Research (Grant 12-07-00216).

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kolpakov, R., Podolskiy, M., Posypkin, M., Khrapov, N. (2014). Searching of Gapped Repeats and Subrepetitions in a Word. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_22

  • DOI: https://doi.org/10.1007/978-3-319-07566-2_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07565-5

  • Online ISBN: 978-3-319-07566-2

  • eBook Packages: Computer Science, Computer Science (R0)