Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications

Kasai, Toru; Lee, Gunho; Arimura, Hiroki; Arikawa, Setsuo; Park, Kunsoo

doi:10.1007/3-540-48194-X_17


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2089)

Included in the following conference series:

Annual Symposium on Combinatorial Pattern Matching


Abstract

We present a linear-time algorithm to compute the longest common prefix information in suffix arrays. As two applications of our algorithm, we show that our algorithm is crucial to the effective use of block-sorting compression, and we present a linear-time algorithm to simulate the bottom-up traversal of a suffix tree with a suffix array combined with the longest common prefix information.
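
To make the first claim concrete, the following is a minimal sketch of linear-time longest-common-prefix computation in the form in which the technique is commonly described; it is not taken from the paper's text. The function names `suffix_array_naive` and `lcp_array` are illustrative assumptions, and the suffix array is built by naive sorting only so that the example is self-contained. The LCP step relies on one observation: if the suffix starting at position i shares h characters with its predecessor in suffix-array order, then the suffix starting at i + 1 shares at least h - 1 characters with its own predecessor, so the match length never needs to restart from zero.

```python
def suffix_array_naive(text):
    """Return suffix start positions in lexicographic order.

    Naive O(n^2 log n) construction, used here only to keep the demo
    self-contained; it is not part of the linear-time LCP step.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Return lcp, where lcp[r] is the length of the longest common prefix
    of the suffixes at sa[r - 1] and sa[r]; lcp[0] is defined as 0.

    Assumption: sa is a valid suffix array of text.
    """
    n = len(text)

    # rank is the inverse of sa: rank[i] is the position of suffix i in sa.
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r

    lcp = [0] * n
    h = 0  # match length carried over from the previous text position
    for i in range(n):  # visit suffixes in text order, not in sorted order
        r = rank[i]
        if r == 0:
            h = 0  # the smallest suffix has no predecessor
            continue
        j = sa[r - 1]  # predecessor of suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[r] = h
        if h > 0:
            h -= 1  # suffix i + 1 keeps at least h - 1 matching characters
    return lcp


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array_naive(text)
    print(sa)                   # [5, 3, 1, 0, 4, 2]
    print(lcp_array(text, sa))  # [0, 1, 3, 0, 0, 2]
```

Because `h` decreases by at most one per iteration of the outer loop and can never exceed n, the inner `while` loop performs O(n) character comparisons in total, which gives the linear bound for the LCP step.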



Author information

Authors and Affiliations

Department of Informatics, Kyushu University, Fukuoka, 812-8581, Japan
Toru Kasai, Hiroki Arimura & Setsuo Arikawa
School of Computer Science and Engineering, Seoul National University, Seoul, 151-742, Korea
Gunho Lee & Kunsoo Park
PRESTO, Japan Science and Technology Corporation, Japan
Hiroki Arimura


Editor information

Editors and Affiliations

Department of Computer Science, Bar-Ilan University, 52900, Ramat-Gan, Israel, Atlanta, Georgia, 30332-0280, USA
Amihood Amir


About this paper

Cite this paper

Kasai, T., Lee, G., Arimura, H., Arikawa, S., Park, K. (2001). Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications. In: Amir, A. (eds) Combinatorial Pattern Matching. CPM 2001. Lecture Notes in Computer Science, vol 2089. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48194-X_17


DOI: https://doi.org/10.1007/3-540-48194-X_17
Published: 13 June 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42271-6
Online ISBN: 978-3-540-48194-2
eBook Packages: Springer Book Archive
