On-line construction of suffix trees

Abstract

An on-line algorithm is presented for constructing the suffix tree for a given string in time linear in the length of the string. The new algorithm has the desirable property of processing the string symbol by symbol from left to right, so that the suffix tree for the scanned part of the string is always ready. The method is developed as a linear-time version of a very simple algorithm for (quadratic-size) suffix tries. Despite its quadratic worst case, the latter algorithm can be a good practical method when the string is not too long. Another variation of this method is shown to give, in a natural way, the well-known algorithms for constructing suffix automata (DAWGs).
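To make the idea of a simple on-line suffix trie construction concrete, the following is a minimal Python sketch, not the paper's own formulation. It keeps one trie node per distinct substring of the scanned text and, for each new symbol, walks the chain of suffix links from the node for the whole text towards the root, adding a child wherever one is missing. The names `Node`, `SuffixTrie`, `extend`, and `contains` are hypothetical, and treating the root's missing suffix link as `None` is an assumption of this sketch.

```python
class Node:
    """One state of the suffix trie; it represents one distinct substring."""
    __slots__ = ("children", "suffix_link")

    def __init__(self):
        self.children = {}       # symbol -> child Node
        self.suffix_link = None  # node for this substring minus its first symbol


class SuffixTrie:
    """Suffix trie grown one symbol at a time (quadratic size in the worst case)."""

    def __init__(self):
        self.root = Node()    # represents the empty string
        self.top = self.root  # node for the whole text scanned so far

    def extend(self, symbol):
        """Update the trie from text T to text T + symbol."""
        r = self.top
        prev_new = None
        # Visit the suffixes of T from longest to shortest via suffix links.
        # Stop at the first suffix that already has a `symbol` child:
        # every shorter suffix then has one as well.
        while r is not None and symbol not in r.children:
            new = Node()
            r.children[symbol] = new
            if prev_new is not None:
                prev_new.suffix_link = new
            prev_new = new
            r = r.suffix_link
        if prev_new is not None:
            # Link the last new node to an existing node, or to the root
            # if the walk ran past the root (assumption: None marks that case).
            prev_new.suffix_link = self.root if r is None else r.children[symbol]
        self.top = self.top.children[symbol]

    def contains(self, pattern):
        """Return True if `pattern` is a substring of the text scanned so far."""
        node = self.root
        for ch in pattern:
            node = node.children.get(ch)
            if node is None:
                return False
        return True


if __name__ == "__main__":
    trie = SuffixTrie()
    for ch in "cacao":
        trie.extend(ch)  # after each call the trie for the scanned prefix is ready
    print(trie.contains("aca"), trie.contains("oc"))
```

Because every distinct substring gets its own node, this structure can grow quadratically with the text length; the linear-time suffix tree construction avoids that by representing unary paths compactly.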

Additional information

This research was supported by the Academy of Finland and by the Alexander von Humboldt Foundation (Germany).

Communicated by K. Mehlhorn.

About this article

Cite this article

Ukkonen, E. On-line construction of suffix trees. Algorithmica 14, 249–260 (1995). https://doi.org/10.1007/BF01206331

Key words

  • Linear-time algorithm
  • Suffix tree
  • Suffix trie
  • Suffix automaton
  • DAWG