Finding Maximal Quasiperiodicities in Strings

  • Gerth St/øting Brodal
  • Christian N. S. Pedersen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1848)

Abstract

Apostolico and Ehrenfeucht defined the notion of a maximal quasiperiodic substring and gave an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log2 n). In this paper we give an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log n) and space O(n). Our algorithm uses the suffix tree as the fundamental data structure combined with efficient methods for merging and performing multiple searches in search trees. Besides finding all maximal quasiperiodic substrings, our algorithm also marks the nodes in the suffix tree that have a superprimitive path-label.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    G. M. Adel’son-Vel’skii and Y. M. Landis. An algorithm for the organization of information. Doklady Akademii Nauk SSSR, 146:263–266, 1962. English translation in Soviet Math. Dokl., 3:1259–1262.MathSciNetGoogle Scholar
  2. 2.
    A. V. Aho, J. E. Hopcraft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Massachusetts, 1974.MATHGoogle Scholar
  3. 3.
    A. Apostolico and D. Breslauer. Of periods, quasiperiods, repetitions and covers. In A selection of essays in honor of A. Ehrenfeucht, volume 1261 of Lecture Notes in Computer Science. Springer, 1997.Google Scholar
  4. 4.
    A. Apostolico and A. Ehrenfeucht. Efficient detection of quasiperiodicities in strings. Theoretical Computer Science, 119:247–265, 1993.MATHCrossRefMathSciNetGoogle Scholar
  5. 5.
    A. Apostolico, M. Farach, and C. S. Iliopoulos. Optimal superprimitivity testing for strings. Information Processing Letters, 39:17–20, 1991.MATHCrossRefMathSciNetGoogle Scholar
  6. 6.
    A. Apostolico and F. P. Preparata. Optimal off-line detection of repetitions in a string. Theoretical Computer Science, 22:297–315, 1983.MATHCrossRefMathSciNetGoogle Scholar
  7. 7.
    D. Breslauer. An on-line string superprimitivity test. Information Processing Letters, 44:345–347, 1992.MATHCrossRefMathSciNetGoogle Scholar
  8. 8.
    G. S. Brodal, R. B. Lyngsø, C. N. S. Pedersen, and J. Stoye. Finding maximal pairs with bounded gap. In Proceedings of the 10th Annual Symposium on Combinatorial Pattern Matching (CPM), volume 1645 of Lecture Notes in Computer Science, pages 134–149, 1999.CrossRefGoogle Scholar
  9. 9.
    G. S. Brodal and C. N. S. Pedersen. Finding maximal quasiperiodicities in strings. Technical Report RS-99-25, BRICS, September 1999.Google Scholar
  10. 10.
    M. R. Brown and R. E. Tarjan. A fast merging algorithm. Journal of the ACM, 26(2):211–226, 1979.MATHCrossRefMathSciNetGoogle Scholar
  11. 11.
    M. Crochemore. An optimal algorithm for computing the repetitions in a word. Information Processing Letters, 12(5):244–250, 1981.MATHCrossRefMathSciNetGoogle Scholar
  12. 12.
    M. Crochemore. Transducers and repetitions. Theoretical Computer Science, 45:63–86, 1986.MATHCrossRefMathSciNetGoogle Scholar
  13. 13.
    D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.Google Scholar
  14. 14.
    D. Gusfield and J. Stoye. Linear time algorithms for finding and representing all the tandem repeats in a string. Technical Report CSE-98-4, Department of Computer Science, UC Davis, 1998.Google Scholar
  15. 15.
    C. S. Iliopoulos and L. Mouchard. Quasiperiodicity: From detection to normal forms. Journal of Automata, Languages and Combinatorics, 4(3):213–228, 1999.MATHMathSciNetGoogle Scholar
  16. 16.
    R. Kolpakov and G. Kucherov. Maximal repetitions in words or how to find all squares in linear time. Technical Report 98-R-227, LORIA, 1998.Google Scholar
  17. 17.
    M. G. Main and R. J. Lorentz. An O(n log n) algorithm for finding all repetitions in a string. Journal of Algorithms, 5:422–432, 1984.MATHCrossRefMathSciNetGoogle Scholar
  18. 18.
    M. G. Main and R. J. Lorentz. Linear time recognition of squarefree strings. In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words, volume F12 of NATO ASI Series, pages 271–278. Springer, Berlin, 1985.Google Scholar
  19. 19.
    E. M. McCreight. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262–272, 1976.MATHCrossRefMathSciNetGoogle Scholar
  20. 20.
    K. Mehlhorn. Sorting and Searching, volume 1 of Data Structures and Algorithms. Springer-Verlag, 1994.Google Scholar
  21. 21.
    Mehlhorn and S. Näher. The LEDA Platform of Combinatorial and Geometric Computing. Cambridge University Press, 1999.Google Scholar
  22. 22.
    D. Moore and W. F. Smyth. Computing the covers of a string in linear time. In Proceedings of the 5th Annual Symposium on Discrete Algorithms (SODA), pages 511–515, 1994.Google Scholar
  23. 23.
    M. Rabin. Discovering repetitions in strings. In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words, volume F12 of NATO ASI Series, pages 279–288. Springer, Berlin, 1985.Google Scholar
  24. 24.
    J. Stoye and D. Gusfield. Simple and flexible detection of contiguous repeats using a suffix tree. In Proceedings of the 9th Annual Symposium on Combinatorial Pattern Matching (CPM), volume 1448 of Lecture Notes in Computer Science, pages 140–152, 1998.CrossRefGoogle Scholar
  25. 25.
    A. Thue. Uber unendliche Zeichenreihen. Skrifter udgivet af Videnskabsselskabet i Christiania, Mathematiskog Naturvidenskabeligklasse, 7:1–22, 1906.Google Scholar
  26. 26.
    A. Thue. Uber die gegenseitige Lage gleicher Teile gewisser Zeichenreihen. Skrifter udgivet af Videnskabsselskabet i Christiania, Mathematiskog Naturvidenskabeligklasse, 1:1–67, 1912.Google Scholar
  27. 27.
    E. Ukkonen. On-line construction of suffix trees. Algorithmica, 14:249–260, 1995.MATHCrossRefMathSciNetGoogle Scholar
  28. 28.
    P. Weiner. Linear pattern matching algorithms. In Proceedings of the 14th Symposium on Switching and Automata Theory, pages 1–11, 1973.Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Gerth St/øting Brodal
    • 1
  • Christian N. S. Pedersen
    • 1
  1. 1.Basic Research in Computer Science (BRICS), Centre of the Danish National Research Foundation, Department of Computer ScienceUniversity of AarhusÅrhus CDenmark

Personalised recommendations