Efficient Representation for Online Suffix Tree Construction

  • Conference paper
Experimental Algorithms (SEA 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8504)

Abstract

Suffix tree construction algorithms based on suffix links are popular because they are simple to implement and can operate online in linear time, and because the suffix links are often convenient for pattern matching. We present an approach using edge-oriented suffix links, which reduces the number of branch lookup operations (known to be a bottleneck in construction time), together with some additional techniques to reduce construction cost. We discuss various effects of our approach and compare it to previous techniques. An experimental evaluation shows that we are able to reduce construction time to around half that of the original algorithm, and to about two thirds that of previously known branch-reduced construction.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Larsson, N.J., Fuglsang, K., Karlsson, K. (2014). Efficient Representation for Online Suffix Tree Construction. In: Gudmundsson, J., Katajainen, J. (eds) Experimental Algorithms. SEA 2014. Lecture Notes in Computer Science, vol 8504. Springer, Cham. https://doi.org/10.1007/978-3-319-07959-2_34

  • DOI: https://doi.org/10.1007/978-3-319-07959-2_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07958-5

  • Online ISBN: 978-3-319-07959-2

  • eBook Packages: Computer Science, Computer Science (R0)
