Efficient Wavelet Tree Construction and Querying for Multicore Architectures

  • José Fuentes-Sepúlveda
  • Erick Elejalde
  • Leo Ferres
  • Diego Seco
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8504)

Abstract

Wavelet trees have become a widely used tool for handling large data sequences efficiently. Over the last decade, multicore architectures have also become ubiquitous, and parallelism has become essential for gaining performance. This paper introduces two practical multicore algorithms for wavelet tree construction that run in \(O(n)\) time using \(\lg \sigma\) processors, where \(n\) is the size of the input and \(\sigma\) is the alphabet size. Both algorithms are memory-efficient. We also present a querying technique based on batch processing that improves on simple domain-decomposition techniques.
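To make the \(O(n)\) time with \(\lg \sigma\) processors claim concrete, the following is a minimal sketch of one way such a construction could be organised: each of the \(\lceil \lg \sigma \rceil\) levels of a levelwise (pointerless) wavelet tree is built directly from the input by its own worker. This is an illustration under stated assumptions, not necessarily either of the paper's two algorithms. The names `build_level` and `build_wavelet_tree` are hypothetical, symbols are assumed to be integers in \([0, \sigma)\), and bitmaps are plain Python lists rather than succinct rank/select structures.

```python
from concurrent.futures import ThreadPoolExecutor


def build_level(seq, level, height):
    """Build the bitmap of one wavelet tree level directly from the input.

    Assumption: symbols are integers that fit in `height` bits.
    """
    shift = height - level  # dropping `shift` low bits leaves the node id
    num_nodes = 1 << level

    # Count how many symbols fall into each node of this level.
    counts = [0] * num_nodes
    for s in seq:
        counts[s >> shift] += 1

    # Exclusive prefix sums give the start offset of each node's bitmap.
    offsets = [0] * num_nodes
    for i in range(1, num_nodes):
        offsets[i] = offsets[i - 1] + counts[i - 1]

    # Stable placement: write the bit that decides left/right at this level.
    bits = [0] * len(seq)
    for s in seq:
        node = s >> shift
        bits[offsets[node]] = (s >> (shift - 1)) & 1
        offsets[node] += 1
    return bits


def build_wavelet_tree(seq, sigma):
    """Return one bitmap per level, each level built by its own worker."""
    height = max(1, (sigma - 1).bit_length())  # ceil(lg sigma), at least 1
    with ThreadPoolExecutor(max_workers=height) as pool:
        return list(pool.map(lambda lvl: build_level(seq, lvl, height),
                             range(height)))


levels = build_wavelet_tree([3, 0, 2, 1, 3, 0], sigma=4)
print(levels)  # [[1, 0, 1, 0, 1, 0], [0, 1, 0, 1, 0, 1]]
```

Because no level depends on another, the levels need no synchronisation. Each worker does \(O(n + 2^{\ell})\) work for level \(\ell\), which is \(O(n)\) whenever \(\sigma \le n\). In CPython the global interpreter lock prevents real speedup from threads, so the sketch shows only the decomposition, not the performance.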

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • José Fuentes-Sepúlveda (1)
  • Erick Elejalde (1)
  • Leo Ferres (1)
  • Diego Seco (1)

  1. Universidad de Concepción, Concepción, Chile
