CPM 2003: Combinatorial Pattern Matching pp 55-69

# Fast Lightweight Suffix Array Construction and Checking

• Stefan Burkhardt
• Juha Kärkkäinen
Conference paper

DOI: 10.1007/3-540-44888-8_5

Volume 2676 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Burkhardt S., Kärkkäinen J. (2003) Fast Lightweight Suffix Array Construction and Checking. In: Baeza-Yates R., Chávez E., Crochemore M. (eds) Combinatorial Pattern Matching. CPM 2003. Lecture Notes in Computer Science, vol 2676. Springer, Berlin, Heidelberg

## Abstract

We describe an algorithm that, for any v ∈ [2, n], constructs the suffix array of a string of length n in $$\mathcal{O}(vn + n \log n)$$ time using $$\mathcal{O}(v + n/\sqrt{v})$$ space in addition to the input (the string) and the output (the suffix array). By setting v = log n, we obtain an $$\mathcal{O}(n \log n)$$ time algorithm using $$\mathcal{O}(n/\sqrt{\log n})$$ extra space. This solves the open problem stated by Manzini and Ferragina [ESA ’02] of whether there exists a lightweight (sublinear extra space) $$\mathcal{O}(n \log n)$$ time algorithm. The key idea of the algorithm is to first sort a sample of suffixes chosen using mathematical constructs called difference covers. The algorithm is not only lightweight but also fast in practice as demonstrated by experiments. Additionally, we describe fast and lightweight suffix array checkers, i.e., algorithms that check the correctness of a suffix array.
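
To make the two notions in the abstract concrete, the following Python sketch illustrates what a difference cover is and what it means to check a suffix array. It is not the paper's algorithm: the function names `is_difference_cover` and `check_suffix_array_naive` are hypothetical, the difference cover test uses the standard definition (every residue modulo v is a difference of two elements of the set), and the checker is a deliberately naive one that compares adjacent suffixes directly, so it is neither fast nor lightweight in the paper's sense.

```python
def is_difference_cover(D, v):
    """Return True if D is a difference cover modulo v.

    Standard definition (assumed here): every residue in [0, v) can be
    written as (a - b) mod v for some a, b in D.
    """
    diffs = {(a - b) % v for a in D for b in D}
    return diffs == set(range(v))


def check_suffix_array_naive(text, sa):
    """Naively check that sa is the suffix array of text.

    Illustrative only: this compares whole suffixes, which can take
    quadratic time, unlike the lightweight checkers the paper describes.
    """
    n = len(text)
    # sa must be a permutation of 0..n-1.
    if sorted(sa) != list(range(n)):
        return False
    # Adjacent suffixes must be in strictly increasing lexicographic order.
    return all(text[sa[i - 1]:] < text[sa[i]:] for i in range(1, n))


if __name__ == "__main__":
    print(is_difference_cover({1, 2, 4}, 7))
    print(check_suffix_array_naive("banana", [5, 3, 1, 0, 4, 2]))
```

For example, {1, 2, 4} is a difference cover modulo 7 with only three elements, which hints at why a sample based on such a set can be much smaller than the full set of suffixes.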
