Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications

  • Toru Kasai
  • Gunho Lee
  • Hiroki Arimura
  • Setsuo Arikawa
  • Kunsoo Park
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2089)

Abstract

We present a linear-time algorithm to compute the longest common prefix information in suffix arrays. As two applications of our algorithm, we show that our algorithm is crucial to the effective use of block-sorting compression, and we present a linear-time algorithm to simulate the bottom-up traversal of a suffix tree with a suffix array combined with the longest common prefix information.
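
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how longest-common-prefix values for adjacent suffixes in a suffix array can be computed in linear time, in the form this technique is commonly presented. The function names `build_suffix_array` and `lcp_array`, and the convention that `lcp[0]` is 0, are assumptions made for illustration and are not taken from the paper. The suffix array here is built by naive sorting purely to keep the example self-contained; only `lcp_array` is meant to be linear.

```python
def build_suffix_array(text):
    """Naive suffix array by sorting suffixes (illustration only, not linear time)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Return lcp, where lcp[k] is the length of the longest common prefix of
    the suffixes starting at sa[k-1] and sa[k]; lcp[0] is 0 by convention.

    Suffixes are visited in text order. When moving from suffix i to suffix
    i + 1, the common-prefix length with the lexicographic predecessor can
    drop by at most one, so the running value h is decremented at most n
    times and the total number of character comparisons is O(n).
    """
    n = len(text)
    rank = [0] * n                      # rank[i] = position of suffix i in sa
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    h = 0                               # common-prefix length carried forward
    for i in range(n):
        if rank[i] == 0:
            h = 0                       # smallest suffix has no predecessor
            continue
        j = sa[rank[i] - 1]             # suffix just before suffix i in sa
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1                      # next suffix shares at least h - 1 chars
    return lcp


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)                   # [5, 3, 1, 0, 4, 2]
    print(lcp_array(text, sa))  # [0, 1, 3, 0, 0, 2]
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so adjacent pairs share 1, 3, 0, 0 and 2 leading characters respectively.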

Keywords

Internal Node; Suffix Array; Text Database; Lower Common Ancestor; Compact Tree

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Toru Kasai (1)
  • Gunho Lee (2)
  • Hiroki Arimura (1, 3)
  • Setsuo Arikawa (1)
  • Kunsoo Park (2)

  1. Department of Informatics, Kyushu University, Fukuoka, Japan
  2. School of Computer Science and Engineering, Seoul National University, Seoul, Korea
  3. PRESTO, Japan Science and Technology Corporation, Japan