Linear-Time Construction of Suffix Arrays

Extended Abstract

  • Conference paper
Combinatorial Pattern Matching (CPM 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2676))

Abstract

The time complexity of suffix tree construction has been shown to be equivalent to that of sorting: O(n) for a constant-size alphabet or an integer alphabet and O(n log n) for a general alphabet. However, previous algorithms for constructing suffix arrays have the time complexity of O(n log n) even for a constant-size alphabet.
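For orientation, a suffix array lists the starting positions of all suffixes of a text in lexicographic order. The sketch below builds one by sorting the suffixes directly. It is only a naive illustration of the data structure, not the algorithm of this paper and not any of the O(n log n) algorithms mentioned above; the function name `naive_suffix_array` is a hypothetical helper, and comparing whole suffixes makes the worst case far slower than linear.

```python
def naive_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Illustration only: each comparison may scan O(n) characters, so this is
    roughly O(n^2 log n) in the worst case, not linear time.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


if __name__ == "__main__":
    # Suffixes of "banana" in sorted order: a, ana, anana, banana, na, nana
    print(naive_suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
```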

In this paper we present a linear-time algorithm for constructing suffix arrays over integer alphabets that does not use suffix trees as intermediate data structures during the construction. Since the case of a constant-size alphabet is subsumed by that of an integer alphabet, our result implies that the time complexity of directly constructing suffix arrays matches that of constructing suffix trees.
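One way to see why a constant-size alphabet is a special case of an integer alphabet is to replace each character by its rank among the distinct characters of the text. The sketch below does this; `to_integer_alphabet` is a hypothetical helper introduced for illustration, not a routine from the paper, and it assumes that the characters can be compared with Python's default ordering. Because the mapping preserves character order, the suffixes of the resulting integer sequence sort in the same order as the suffixes of the original text.

```python
def to_integer_alphabet(text: str) -> list[int]:
    """Map each character to its rank (1-based) among the distinct characters.

    The ranks preserve the relative order of the characters, so suffix
    order is unchanged by the mapping.
    """
    ranks = {ch: r for r, ch in enumerate(sorted(set(text)), start=1)}
    return [ranks[ch] for ch in text]


if __name__ == "__main__":
    # 'a' -> 1, 'b' -> 2, 'n' -> 3
    print(to_integer_alphabet("banana"))  # [2, 1, 3, 1, 3, 1]
```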

Supported by KOSEF grant R01-2002-000-00589-0.

Supported by BK21 Project and IMT2000 Project AB02.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, D.K., Sim, J.S., Park, H., Park, K. (2003). Linear-Time Construction of Suffix Arrays. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds) Combinatorial Pattern Matching. CPM 2003. Lecture Notes in Computer Science, vol 2676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44888-8_14

  • DOI: https://doi.org/10.1007/3-540-44888-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40311-1

  • Online ISBN: 978-3-540-44888-4

  • eBook Packages: Springer Book Archive
