What can we learn about suffix trees from independent tries?

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 519)

Abstract

A suffix tree of a word is a digital tree built from the suffixes of that word. We consider words that are random sequences of independent symbols over a finite alphabet. Our main finding is that the depths in a suffix tree are asymptotically equivalent to the depths in a digital tree that stores independent keys (i.e., independent digital trees, also known as tries). More precisely, we prove that the depths in a suffix tree built from the first n suffixes of a random word are normally distributed with mean asymptotically equivalent to (1/h_1) log n and variance α·log n, where h_1 is the entropy of the alphabet and α is a parameter of the probabilistic model. Our results provide new insights into the asymptotic properties of compression schemes, and therefore find direct applications in computer science and telecommunications, most notably in coding theory, the theory of languages, and the design and analysis of algorithms.
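The comparison stated in the abstract can be explored numerically. The following is a minimal sketch, not the authors' method: it assumes a non-compact trie in which the depth of a key is the length of its shortest prefix shared with no other key, truncates every suffix to a fixed length (the hypothetical parameter key_len) so that suffixes and independent keys are handled alike, and uses the natural logarithm for both h_1 and log n. All function names are illustrative.

```python
import math
import random


def random_symbols(rng, length, probs):
    """Draw `length` independent symbols; symbol k has probability probs[k]."""
    return rng.choices(range(len(probs)), weights=probs, k=length)


def lcp(a, b):
    """Length of the longest common prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def trie_depths(keys):
    """Depth of each key's leaf in the trie built from `keys`.

    Assumed definition: depth = 1 + longest common prefix with any other key.
    In sorted order that longest prefix is attained by an adjacent key, so
    only the two neighbours need to be inspected. Identical keys (possible
    only because keys are truncated) would get depth len(key) + 1.
    """
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    depths = [0] * len(keys)
    for pos, i in enumerate(order):
        best = 0
        if pos > 0:
            best = max(best, lcp(keys[i], keys[order[pos - 1]]))
        if pos + 1 < len(order):
            best = max(best, lcp(keys[i], keys[order[pos + 1]]))
        depths[i] = best + 1
    return depths


def compare_depths(n=2000, probs=(0.5, 0.5), key_len=64, seed=1):
    """Return (mean suffix depth, mean independent-key depth, (1/h_1) log n)."""
    rng = random.Random(seed)
    # One random word; its first n suffixes, each truncated to key_len symbols.
    word = random_symbols(rng, n + key_len, probs)
    suffix_keys = [tuple(word[i:i + key_len]) for i in range(n)]
    # n keys drawn independently from the same symbol distribution.
    indep_keys = [tuple(random_symbols(rng, key_len, probs)) for _ in range(n)]
    h1 = -sum(p * math.log(p) for p in probs)  # entropy of the alphabet
    suffix_depths = trie_depths(suffix_keys)
    indep_depths = trie_depths(indep_keys)
    return (sum(suffix_depths) / n, sum(indep_depths) / n, math.log(n) / h1)


if __name__ == "__main__":
    suffix_mean, indep_mean, leading_term = compare_depths()
    print(suffix_mean, indep_mean, leading_term)
```

Under these assumptions the two empirical means should lie close to each other and within a bounded offset of the leading term (1/h_1) log n; the abstract's claim concerns the asymptotic regime, so a finite-n run is only suggestive.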

This research was primarily supported by NATO Collaborative Grant 0057/89.

This research was primarily done while the author was visiting INRIA in Rocquencourt, France. Support was provided in part by NATO Collaborative Grant 0057/89, in part by NSF Grants NCR-8702115 and CCR-8900305, in part by Grant AFOSR-90-0107, and in part by Grant R01 LM05118 from the National Library of Medicine.

Editor information

Frank Dehne, Jörg-Rüdiger Sack, Nicola Santoro

Copyright information

© 1991 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jacquet, P., Szpankowski, W. (1991). What can we learn about suffix trees from independent tries? In: Dehne, F., Sack, JR., Santoro, N. (eds) Algorithms and Data Structures. WADS 1991. Lecture Notes in Computer Science, vol 519. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0028265

  • DOI: https://doi.org/10.1007/BFb0028265

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-54343-5

  • Online ISBN: 978-3-540-47566-8

  • eBook Packages: Springer Book Archive
