Suffix trees and string complexity

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 658)


Let $s = (s_1, s_2, \ldots, s_n)$ be a sequence of characters where $s_i \in \mathbb{Z}_p$ for $1 \le i \le n$. One measure of the complexity of the sequence $s$ is the length of the shortest feedback shift register that will generate $s$, which is known as the maximum order complexity of $s$ [17, 18]. We provide a proof that the expected length of the shortest feedback register to generate a sequence of length $n$ is less than $2 \log_p n + o(1)$, and also give several other statistics of interest for distinguishing random strings. The proof is based on relating the maximum order complexity to a data structure known as a suffix tree.
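
As a rough illustration of the quantity discussed in the abstract, the following sketch computes the maximum order complexity by brute force. It assumes the usual reading of the definition: a register of length L generates the sequence exactly when every length-L window that has a successor is always followed by the same character. The function names `max_order_complexity` and `bound_leading_term` are hypothetical, and this is not the suffix-tree method of the paper, only a direct check of the definition.

```python
import math
import random


def max_order_complexity(s):
    """Return the smallest L such that each length-L window of s that has a
    successor is always followed by the same character.

    Brute force over L with a dictionary of windows; for illustration only.
    """
    n = len(s)
    for L in range(n):
        successor = {}  # window -> the character observed right after it
        consistent = True
        for i in range(n - L):
            window = tuple(s[i:i + L])
            nxt = s[i + L]
            # setdefault stores nxt on first sight, else returns the old value
            if successor.setdefault(window, nxt) != nxt:
                consistent = False
                break
        if consistent:
            return L
    return 0  # only reached for the empty sequence


def bound_leading_term(n, p):
    """Leading term 2 * log_p(n) of the bound quoted in the abstract."""
    return 2 * math.log(n) / math.log(p)


if __name__ == "__main__":
    # Hypothetical experiment: average complexity of random binary strings.
    rng = random.Random(0)
    n, p, trials = 256, 2, 50
    total = 0
    for _ in range(trials):
        seq = [rng.randrange(p) for _ in range(n)]
        total += max_order_complexity(seq)
    print("average complexity:", total / trials)
    print("2 log_p n         :", bound_leading_term(n, p))
```

A window of length L that recurs with two different successors forces the register to be longer than L, which is the property that a suffix tree exposes through its branching nodes. The dictionary scan above trades that efficiency for brevity.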


Keywords: Binary Sequence, Finite State Machine, Linear Span, Shift Register, Stream Cipher


  [1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley Publishing Company, 1974.
  [2] A. Apostolico and W. Szpankowski. Self-alignments in words and their applications. Technical Report CSD-TR-732, Purdue University, 1987.
  [3] R. Arratia, L. Gordon, and M. Waterman. An extreme value theory for sequence matching. The Annals of Statistics, 14(3):971–993, 1986.
  [4] H. Beker and F. Piper. Cipher Systems. Wiley, 1982.
  [5] A. Blumer, J. Blumer, A. Ehrenfeucht, D. Haussler, and R. McConnell. Linear size finite automata for the set of all subwords of a word: outline of results. Bulletin of the European Association of Theoretical Computer Science, 21:12–20, 1983.
  [6] A. Blumer, A. Ehrenfeucht, and D. Haussler. Average sizes of suffix trees and DAWGs. Discrete Applied Mathematics, 24:37–45, 1989.
  [7] A. Chan and R. Games. On the quadratic spans of periodic sequences. IEEE Transactions on Information Theory, IT-36(4):822–829, 1990.
  [8] N. G. de Bruijn. A combinatorial problem. Nederl. Akad. Wetensch. Proc., 49:754–758, 1946.
  [9] D. E. R. Denning. Cryptography and Data Security. Addison-Wesley Publishing Company, 1982.
  [10] L. Devroye. A probabilistic analysis of the height of tries and of the complexity of triesort. Acta Informatica, 21:229–232, 1984.
  [11] S. Golomb. Shift Register Sequences. Aegean Park Press, 1982.
  [12] G. H. Gonnet and R. Baeza-Yates. Handbook of Algorithms and Data Structures. Addison-Wesley, Second Edition, 1991.
  [13] E. J. Groth. Generation of binary sequences with controllable complexity. IEEE Transactions on Information Theory, 17(3):288–296, 1971.
  [14] H. Gustafson, E. Dawson, and W. Caelli. Comparison of block ciphers. Advances in Cryptology, AUSCRYPT 90, Lecture Notes in Computer Science, vol. 453, J. Seberry and J. Pieprzyk, eds., Springer-Verlag, pages 208–220, 1990.
  [15] J. Hopcroft and J. Ullman. An introduction to automata, languages and computation. Addison-Wesley Publishing Company, 1979.
  [16] P. Jacquet and W. Szpankowski. Autocorrelation on words and its applications: analysis of suffix trees by string-ruler approach. Preprint, 1990.
  [17] C. J. A. Jansen. Investigations on Nonlinear Streamcipher Systems: Construction and Evaluation Methods. PhD thesis, Philips, USFA BV, 1989.
  [18] C. J. A. Jansen and D. Boekee. The shortest feedback shift register that can generate a sequence. Advances in Cryptology, CRYPTO 89, Lecture Notes in Computer Science, vol. 435, G. Brassard, ed., Springer-Verlag, pages 90–99, 1990.
  [19] E. L. Key. An analysis of the structure and complexity of nonlinear binary sequence generators. IEEE Transactions on Information Theory, 22(6):732–736, 1976.
  [20] D. E. Knuth. The Art of Computer Programming: Volume 3, Sorting and Searching. Addison-Wesley, 1973.
  [21] D. E. Knuth. The Art of Computer Programming: Volume 2, Seminumerical Algorithms. Addison-Wesley, 1981.
  [22] A. N. Kolmogorov. Three approaches to the quantitative definition of information. Problems in Information Transmission, 1(1):1–7, 1965.
  [23] M. Li and P. M. B. Vitanyi. Two decades of applied Kolmogorov complexity. Technical Report CS-R8813, Centre for Mathematics and Computer Science, April 1988.
  [24] J. L. Massey. Shift-register synthesis and BCH decoding. IEEE Transactions on Information Theory, 15:122–127, 1969.
  [25] U. M. Maurer. Asymptotically tight bounds on the number of cycles in generalized de Bruijn-Good graphs. To appear in Discrete Applied Mathematics.
  [26] E. M. McCreight. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262–272, 1976.
  [27] M. Regnier. On the average height of trees in digital search and dynamic hashing. Information Processing Letters, 13(2):64–66, 1981.
  [28] R. A. Rueppel. Design and Analysis of Stream Ciphers. Springer-Verlag, 1986.
  [29] J. Savage. The Complexity of Computing. John Wiley, 1976.
  [30] W. Szpankowski. On the analysis of the average height of a digital search tree: another approach. Technical Report CSD-TR-646, Purdue University, 1986.

Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  1. Department of Computer Science, University of Waterloo, Canada
