Fast Wavelet Tree Construction in Practice

  • Yusaku Kaneta
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11147)

Abstract

The wavelet tree is a compact data structure that supports various types of operations on a sequence of n integers in \([0,\sigma )\). Although Munro et al. (SPIRE 2014 and Theoretical Computer Science 2016) and Babenko et al. (SODA 2015) showed that wavelet trees can be constructed in \(O(n\lceil \lg \sigma /\sqrt{\lg n}\rceil )\) time, there has been no empirical study of their construction methods, possibly because the methods rely heavily on precomputed tables, which seemingly limits their practicality. In this paper, we propose practical variants of their methods. Instead of using huge precomputed tables, we introduce new techniques based on broadword programming and special CPU instructions available on modern processors. Experiments on real-world texts demonstrated that our proposed methods were up to 2.2 and 4.5 times as fast as the naive ones for the wavelet tree and the wavelet matrix (a variant of wavelet trees), respectively, and up to 1.9 times as fast as a state-of-the-art method for the wavelet matrix.
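
For readers unfamiliar with the structure, the following is a minimal sketch of a naive, bit-at-a-time wavelet matrix construction, roughly the kind of \(O(n \lg \sigma)\) baseline the abstract compares against. It is not the paper's method: the function names `build_wavelet_matrix` and `access` are hypothetical, bit vectors are plain Python lists, and rank is computed by a linear scan rather than a succinct rank structure. The broadword and CPU-instruction techniques proposed in the paper are not shown.

```python
from typing import List, Tuple


def build_wavelet_matrix(seq: List[int], sigma: int) -> Tuple[List[List[int]], List[int]]:
    """Naive construction: one pass over the sequence per bit level.

    Returns (rows, zeros), where rows[l] is the bit vector of level l
    (most significant bit first) and zeros[l] is the number of 0 bits in it.
    """
    levels = max(1, (sigma - 1).bit_length())  # bits needed for values in [0, sigma)
    rows: List[List[int]] = []
    zeros: List[int] = []
    cur = list(seq)
    for level in range(levels):
        shift = levels - 1 - level
        bits = [(x >> shift) & 1 for x in cur]
        rows.append(bits)
        zeros.append(bits.count(0))
        # Stable partition: elements with a 0 bit first, then those with a 1 bit.
        cur = ([x for x, b in zip(cur, bits) if b == 0]
               + [x for x, b in zip(cur, bits) if b == 1])
    return rows, zeros


def access(rows: List[List[int]], zeros: List[int], i: int) -> int:
    """Recover the i-th element of the original sequence from the matrix."""
    value = 0
    for bits, z in zip(rows, zeros):
        b = bits[i]
        value = (value << 1) | b
        # Rank of b in bits[0:i]; a linear scan stands in for a rank structure.
        r = sum(1 for x in bits[:i] if x == b)
        i = r if b == 0 else z + r
    return value
```

Calling `build_wavelet_matrix(seq, sigma)` and then `access(rows, zeros, i)` for each position `i` should reproduce `seq`, which is a convenient sanity check for any faster construction routine.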

Acknowledgements

I would like to thank the authors of [5] for making their code public, the anonymous reviewers for their helpful comments that greatly improved the correctness and presentation of this paper, and Kunihiko Sadakane for a presentation copy of his book that enhanced my understanding of this field.

References

  1. Babenko, M., Gawrychowski, P., Kociumaka, T., Starikovskaya, T.: Wavelet trees meet suffix trees. In: Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2015), pp. 572–591 (2015)
  2. Claude, F., Navarro, G.: Practical rank/select queries over arbitrary sequences. In: Amir, A., Turpin, A., Moffat, A. (eds.) SPIRE 2008. LNCS, vol. 5280, pp. 176–187. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89097-3_18
  3. Claude, F., Navarro, G., Ordóñez, A.: The wavelet matrix: an efficient wavelet tree for large alphabets. Inf. Syst. 47, 15–32 (2015)
  4. Claude, F., Nicholson, P.K., Seco, D.: Space efficient wavelet tree construction. In: Grossi, R., Sebastiani, F., Silvestri, F. (eds.) SPIRE 2011. LNCS, vol. 7024, pp. 185–196. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24583-1_19
  5. Fischer, J., Kurpicz, F., Löbel, M.: Simple, fast and lightweight parallel wavelet tree construction. In: Proceedings of the 20th Workshop on Algorithm Engineering and Experiments (ALENEX 2018), pp. 9–20 (2018)
  6. Fuentes-Sepúlveda, J., Elejalde, E., Ferres, L., Seco, D.: Parallel construction of wavelet trees on multicore architectures. Knowl. Inf. Syst. 51(3), 1043–1066 (2017)
  7. Gog, S., Beller, T., Moffat, A., Petri, M.: From theory to practice: plug and play with succinct data structures. In: Gudmundsson, J., Katajainen, J. (eds.) SEA 2014. LNCS, vol. 8504, pp. 326–337. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07959-2_28
  8. Gog, S., Petri, M.: Optimized succinct data structures for massive data. Softw. Pract. Exp. 44(11), 1287–1314 (2014)
  9. Grossi, R., Gupta, A., Vitter, J.S.: High-order entropy-compressed text indexes. In: Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2003), pp. 841–850 (2003)
  10.
  11. Knuth, D.E.: The Art of Computer Programming, Volume 4, Fascicle 1: Bitwise Tricks & Techniques; Binary Decision Diagrams. Addison-Wesley Professional, Boston (2009)
  12. Labeit, J., Shun, J., Blelloch, G.E.: Parallel lightweight wavelet tree, suffix array and FM-index construction. J. Discret. Algorithms 43, 2–17 (2017)
  13. Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, Burlington (1992)
  14. Munro, J.I., Nekrich, Y., Vitter, J.S.: Fast construction of wavelet trees. Theor. Comput. Sci. 638(C), 91–97 (2016)
  15. Navarro, G.: Wavelet trees for all. J. Discret. Algorithms 25, 2–20 (2014)
  16. Navarro, G.: Compact Data Structures - A Practical Approach. Cambridge University Press, Cambridge (2016)
  17. Navarro, G., Providel, E.: Fast, small, simple rank/select on bitmaps. In: Klasing, R. (ed.) SEA 2012. LNCS, vol. 7276, pp. 295–306. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30850-5_26
  18. Pandey, P., Bender, M.A., Johnson, R., Patro, R.: A general-purpose counting filter: making every bit count. In: Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD 2017), pp. 775–787 (2017)
  19. Shun, J.: Parallel wavelet tree construction. In: Proceedings of the 2015 Data Compression Conference (DCC 2015), pp. 92–101 (2015)
  20. Shun, J.: Improved parallel construction of wavelet trees and rank/select structures. In: Proceedings of the 2017 Data Compression Conference (DCC 2017), pp. 92–101 (2017)
  21. Tischler, G.: On wavelet tree construction. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 208–218. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21458-5_19
  22. Vigna, S.: Broadword implementation of rank/select queries. In: McGeoch, C.C. (ed.) WEA 2008. LNCS, vol. 5038, pp. 154–168. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-68552-4_12
  23. Zhou, D., Andersen, D.G., Kaminsky, M.: Space-efficient, high-performance rank and select structures on uncompressed bit sequences. In: Bonifaci, V., Demetrescu, C., Marchetti-Spaccamela, A. (eds.) SEA 2013. LNCS, vol. 7933, pp. 151–163. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38527-8_15

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Rakuten Institute of Technology, Rakuten, Inc., Setagaya, Japan