Linear Time Suffix Array Construction Using D-Critical Substrings

  • Ge Nong
  • Sen Zhang
  • Wai Hong Chan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5577)

Abstract

In this paper we present in detail a new efficient linear time and space suffix array construction algorithm (SACA), called the D-Critical-Substring algorithm. The algorithm is built upon a novel concept called fixed-size D-Critical-Substrings, which allow us to compute suffix arrays through a balanced combination of bucket sorting and induced sorting. The D-Critical-Substring algorithm is very simple: a fully functioning sample implementation in C++ takes only about 100 effective lines. Experiments on data from the Canterbury and Manzini-Ferragina corpora indicate that our algorithm outperforms the two previously best-known linear time algorithms: the Kärkkäinen-Sanders (KS) and the Ko-Aluru (KA) algorithms.
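
For readers unfamiliar with the output of a SACA, the following is a minimal sketch of what a suffix array is. It is a naive baseline that sorts suffixes by direct comparison, not the D-Critical-Substring algorithm and not linear time; the function name `naive_suffix_array` is a hypothetical name chosen for illustration only.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Illustrative baseline only (assumption: NOT the authors' method).
// Returns the start positions of all suffixes of s in lexicographic order.
// Each comparison may scan O(n) characters, so the worst case is
// O(n^2 log n), far from the linear time bound discussed in the abstract.
std::vector<int> naive_suffix_array(const std::string& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);  // 0, 1, ..., n-1
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        // Compare the suffix starting at a with the suffix starting at b.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}
```

For the input "banana", this sketch returns the positions 5, 3, 1, 0, 4, 2, which correspond to the suffixes a, ana, anana, banana, na, nana in sorted order. A linear time SACA produces the same array without comparing whole suffixes pairwise.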

Keywords

Linear Time, Reduction Ratio, Input String, Suffix Array, Array Construction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Manber, U., Myers, G.: Suffix arrays: A new method for on-line string searches. In: Proceedings of the first ACM-SIAM Symposium on Discrete Algorithms, pp. 319–327 (1990)
  2. Manzini, G., Ferragina, P.: Engineering a lightweight suffix array construction algorithm. Algorithmica 40(1), 33–50 (2004)
  3. Grossi, R., Vitter, J.S.: Compressed suffix arrays and suffix trees with applications to text indexing and string matching. In: The 32nd Annual ACM Symposium on Theory of Computing (STOC 2000), pp. 397–406 (2000)
  4. Kim, D.K., Sim, J.S., Park, H., Park, K.: Linear-time construction of suffix arrays. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 186–199. Springer, Heidelberg (2003)
  5. Kärkkäinen, J., Sanders, P.: Simple linear work suffix array construction. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 943–955. Springer, Heidelberg (2003)
  6. Ko, P., Aluru, S.: Space efficient linear time construction of suffix arrays. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 200–210. Springer, Heidelberg (2003)
  7. Farach, M.: Optimal suffix tree construction with large alphabets. In: FOCS 1997: Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS 1997), p. 137 (1997)
  8.
  9. Zhang, S., Nong, G.: Fast and space efficient linear suffix array construction. In: IEEE Data Compression Conference 2008 (DCC 2008), p. 553 (2008)
  10. Nong, G., Zhang, S.: Optimal lightweight construction of suffix arrays. In: Dehne, F., Sack, J.-R., Zeh, N. (eds.) WADS 2007. LNCS, vol. 4619, pp. 613–624. Springer, Heidelberg (2007)
  11. Sanders, P.: A driver program for the KS algorithm (2007), http://www.mpi-inf.mpg.de/~sanders/programs/suffix/
  12. Ko, P.: Source codes for the KA algorithm (2007), http://kopang.public.iastate.edu/homepage.php?page=source
  13. Puglisi, S.J., Smyth, W.F., Turpin, A.H.: A taxonomy of suffix array construction algorithms. ACM Comput. Surv. 39(2), 1–31 (2007)
  14. Lee, S., Park, K.: Efficient implementations of suffix array construction algorithms. In: Proceedings of the 15th Australasian Workshop on Combinatorial Algorithms, pp. 64–72 (2004)
  15. Nong, G., Zhang, S., Chan, W.H.: Linear suffix array construction by almost pure induced-sorting. In: IEEE Data Compression Conference 2009 (DCC 2009), pp. 193–202 (2009)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ge Nong (1)
  • Sen Zhang (2)
  • Wai Hong Chan (3)
  1. Computer Science Department, Sun Yat-Sen University, People's Republic of China
  2. Dept. of Math., Comp. Sci. and Stat., SUNY College at Oneonta, U.S.A.
  3. Department of Mathematics, Hong Kong Baptist University, Hong Kong
