Data structures and algorithms for the string statistics problem

Abstract

Given a textstring x of length n, the Minimal Augmented Suffix Tree T(x) of x is a digital-search index that returns, for any query string w and in a number of comparisons bounded by the length of w, the maximum number of nonoverlapping occurrences of w in x. It is shown that T(x) can be built in O(n log² n) time and O(n log n) space, off-line on a RAM.
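To make the quantity returned by the index concrete, the following is a minimal sketch of a naive baseline that computes the maximum number of nonoverlapping occurrences of w in x by a greedy left-to-right scan. It is not the construction described in the paper and does not build any suffix tree; the function name `max_nonoverlapping` is a hypothetical choice made for illustration. The greedy choice (always accept the leftmost occurrence that starts after the previously accepted one ends) yields the maximum because all occurrences have the same length.

```python
def max_nonoverlapping(x: str, w: str) -> int:
    """Return the maximum number of pairwise nonoverlapping occurrences of w in x.

    Naive baseline for illustration only (hypothetical helper, not the
    paper's index): each call rescans the text instead of querying a
    precomputed structure.
    """
    if not w:
        raise ValueError("query string w must be nonempty")
    count = 0
    pos = x.find(w)  # leftmost occurrence, or -1 if there is none
    while pos != -1:
        count += 1
        # Resume the search just past the occurrence that was accepted,
        # so the next accepted occurrence cannot overlap it.
        pos = x.find(w, pos + len(w))
    return count
```

For example, with x = "abababa" and w = "aba", w occurs three times (at positions 0, 2, and 4), but at most two of those occurrences are pairwise nonoverlapping. A scan like this takes time that depends on the length of x for every query, whereas the index in the abstract answers in a number of comparisons bounded by the length of w.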

Additional information

This research was supported in part, through the Leonardo Fibonacci Institute, by the Istituto Trentino di Cultura, Trento, Italy.

Additional support was provided by NSF Grants CCR-8900305 and CCR-9201078, by NATO Grant CRG 900293, by the National Research Council of Italy, and by the ESPRIT III Basic Research Programme of the EC under Contract No. 9072 (Project GEPPCOM).

Additional support was provided by NSF Grant CCR-91-96176 and ONR Contract N 00014-91-J-4052, ARPA Order 2225.

Communicated by C. K. Wong.

Cite this article

Apostolico, A., Preparata, F.P. Data structures and algorithms for the string statistics problem. Algorithmica 15, 481–494 (1996). https://doi.org/10.1007/BF01955046

Key words

  • Design and analysis of algorithms
  • Combinatorics on strings
  • Pattern matching
  • Substring statistics
  • Suffix tree
  • Augmented suffix tree
  • Period of a string
  • Repetition in a string