Skip to main content

Parallel Suffix Array Construction for Shared Memory Architectures

  • Conference paper
String Processing and Information Retrieval (SPIRE 2012)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7608))

Included in the following conference series:

Abstract

We present the design of the algorithm for constructing the suffix array of a string using manycore GPUs. Despite of the wide usage in text processing and extensive research over two decades there was a lack of efficient algorithms that were able to exploit shared memory parallelism (as multicore CPUs as manycore GPUs) in practice. To the best of our knowledge we developed the first approach exposing shared memory parallelism that significantly outperforms the state-of-the-art existing implementations for sufficiently large inputs. We reduced the suffix array construction problem to a number of parallel primitives such as prefix-sum, radix sorting, random gather and scatter from/to the memory. Thus, the performance of the algorithm merely depends on the performance of these primitives on the particular shared memory architecture. We demonstrate its performance on manycore GPUs, but the method can also be applied for other parallel architectures, such as multicores, CELL or Intel MIC.

Partially supported by EU Project No. 248481 (PEPPHER) ICT-2009.3.6 and DFG grant SA 933/3-2

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Beckmann, A., Dementiev, R., Singler, J.: Building a parallel pipelined external memory algorithm library. In: Proc. Int’l Symposium on Parallel & Distributed Processing (IPDPS), pp. 1–10 (May 2009)

    Google Scholar 

  2. Dementiev, R., Kärkkäinen, J., Mehnert, J., Sanders, P.: Better external memory suffix array construction. J. Exp. Algorithmics 12, 3.4:1–3.4:24 (2008)

    Google Scholar 

  3. Futamura, N., Aluru, S., Kurtz, S.: Parallel suffix sorting. Electrical Engineering and Computer Science, paper 64 (2001)

    Google Scholar 

  4. Kärkkäinen, J., Sanders, P., Burkhardt, S.: Linear work suffix array construction. J. ACM 53(6), 918–936 (2006)

    Article  MathSciNet  Google Scholar 

  5. Ko, P., Aluru, S.: Space Efficient Linear Time Construction of Suffix Arrays. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 200–210. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  6. Kulla, F., Sanders, P.: Scalable parallel suffix array construction. Parallel Computing 33(9), 605–612 (2007)

    Article  Google Scholar 

  7. Larsson, N.J., Sadakane, K.: Faster suffix sorting. Theoretical Computer Science 387(3), 258–272 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  8. Leischner, N., Osipov, V., Sanders, P.: GPU sample sort. In: Proc. of the IEEE Int’l Symposium on Parallel & Distributed Processing (IPDPS), pp. 1–10 (April 2010)

    Google Scholar 

  9. Manber, U., Myers, G.: Suffix arrays: a new method for on-line string searches. In: Proc. of the 1st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Philadelphia, PA, USA, pp. 319–327 (1990)

    Google Scholar 

  10. Merrill, D., Grimshaw, A.S.: High performance and scalable radix sorting: a case study of implementing dynamic parallelism for gpu computing. Parallel Processing Letters 21(2), 245–272 (2011)

    Article  MathSciNet  Google Scholar 

  11. Mori, Y.: Suffix array construction algorithms benchmark set, http://code.google.com/p/libdivsufsort/wiki/SACA_Benchmarks

  12. Nong, G., Zhang, S., Chan, W.H.: Linear suffix array construction by almost pure Induced-Sorting. In: Proc. of Data Compression Conference (DCC), pp. 193–202. IEEE (March 2009)

    Google Scholar 

  13. Puglisi, S.J., Smyth, W.F., Turpin, A.H.: A taxonomy of suffix array construction algorithms. ACM Computing Surveys 39(2) (July 2007)

    Google Scholar 

  14. Satish, N., Harris, M., Garland, M.: Designing efficient sorting algorithms for manycore gpus. In: Proc. Int’l Symposium on Parallel & Distributed Processing, IPDPS (2009)

    Google Scholar 

  15. Satish, N., Kim, C., Chhugani, J., Nguyen, A.D., Lee, V.W., Kim, D., Dubey, P.: Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort. In: Proc. of the Int’l conference on Management of Data, pp. 351–362. ACM (2010)

    Google Scholar 

  16. Sun, W., Ma, Z.: Parallel lexicographic names construction with CUDA. In: Proc. of the 15th International Conference on Parallel and Distributed Systems (ICPADS), pp. 913–918 (December 2009)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Osipov, V. (2012). Parallel Suffix Array Construction for Shared Memory Architectures. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds) String Processing and Information Retrieval. SPIRE 2012. Lecture Notes in Computer Science, vol 7608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34109-0_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-34109-0_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34108-3

  • Online ISBN: 978-3-642-34109-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics