On Suffix Tree Detection

  • Conference paper
String Processing and Information Retrieval (SPIRE 2023)

Abstract

A suffix tree is a fundamental data structure for string processing and information retrieval; however, its structure is still not well understood. The suffix tree reverse engineering problem, whose study aims to narrow this gap, is the following: given an ordered rooted tree T with unlabeled edges, determine whether there exists a string w such that the unlabeled-edge suffix tree of w is isomorphic to T. Previous studies of this problem consider the relaxation in which the suffix links are also given as input, and assume a binary alphabet. This paper is the first to consider the suffix tree detection problem, in which the relaxation of having suffix links as input is removed. We study suffix tree detection in two scenarios that are interesting per se. First, we provide a suffix tree detection algorithm for periodic strings over a general alphabet. Given an ordered tree T with n leaves, our detection algorithm takes \(O(n+|\varSigma |^p)\) time, where p is the length, unknown in advance, of a period that repeats at least 3 times in a string S whose suffix tree structure is identical to T, if such an S exists. Therefore, it is a polynomial time algorithm if p is a constant, and a linear time algorithm if, in addition, the alphabet has sub-linear size. Second, we show some necessary (but insufficient) conditions for suffix tree detection of general strings over a binary alphabet. With this we take another step towards understanding the structure of suffix trees.
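To make the problem statement concrete, the following is a minimal brute-force sketch of suffix tree detection. It is not the algorithm of the paper: it enumerates every candidate string and is exponential in the number of leaves. The names `suffix_tree_shape`, `count_leaves` and `detect_by_brute_force` are hypothetical, and the sketch assumes that a tree is encoded as nested tuples (a leaf is `()`), that the string is terminated by `$`, and that children are ordered by the first character of their edge with `$` first.

```python
from itertools import product
from os.path import commonprefix


def _shape(strings):
    """Shape of the compacted trie of `strings` (none is a prefix of another)."""
    groups = {}
    for t in strings:
        groups.setdefault(t[0], []).append(t)
    children = []
    for c in sorted(groups):  # '$' sorts before lowercase letters
        group = groups[c]
        if len(group) == 1:
            children.append(())  # a single suffix ends in a leaf
        else:
            # Skip the shared edge label, then branch on what remains.
            k = len(commonprefix(group))
            children.append(_shape([t[k:] for t in group]))
    return tuple(children)


def suffix_tree_shape(w, terminator="$"):
    """Ordered, unlabeled shape of the suffix tree of w + terminator."""
    s = w + terminator
    return _shape([s[i:] for i in range(len(s))])


def count_leaves(tree):
    return 1 if tree == () else sum(count_leaves(child) for child in tree)


def detect_by_brute_force(tree, alphabet):
    """Return some w whose suffix tree has shape `tree`, or None.

    The suffix tree of w$ has len(w) + 1 leaves, so only strings of
    length count_leaves(tree) - 1 need to be tried.
    """
    n = count_leaves(tree)
    for letters in product(alphabet, repeat=n - 1):
        w = "".join(letters)
        if suffix_tree_shape(w) == tree:
            return w
    return None


if __name__ == "__main__":
    print(suffix_tree_shape("aa"))
    print(detect_by_brute_force(((), (), ()), "ab"))
    print(detect_by_brute_force(((), ((), (), ())), "ab"))
```

In the last call the root has only two children, so any witness string would have to be unary, yet a unary string cannot produce an internal node with three children; the search therefore finds no string for that tree.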

Partly supported by ISF grant 1475/18 and BSF grant 2018141.

Notes

  1. Though Sect. 3 studies periodic strings having period length at least 2, which we call non-trivial periodic strings, this complexity is also valid for detection of trivial periodic strings (i.e., unary strings), as follows from condition 2 of Theorem 2 in Sect. 4.

  2. We assume that the basic terms (rooted trees, internal tree nodes, and tree leaves) are known.

  3. Note that the structure of a unary string, which is a trivial periodic string with period length 1, is characterized by Theorem 2.2 in Sect. 4.

  4. p is unique by condition 2 of Theorem 1.

  5. The characters \(\sigma \in \varSigma \) are assigned to the edges of T’s root from left to right according to the order of \(\varSigma \), where the first edge is \(\$\).

Corresponding author

Correspondence to Avivit Levy.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Amir, A., Kondratovsky, E., Levy, A. (2023). On Suffix Tree Detection. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_2

  • DOI: https://doi.org/10.1007/978-3-031-43980-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43979-7

  • Online ISBN: 978-3-031-43980-3

  • eBook Packages: Computer Science, Computer Science (R0)
