Abstract
The wavelet tree has become a very useful data structure for efficiently representing and querying large volumes of data in many domains, from bioinformatics to geographic information systems. One drawback of wavelet trees is their construction time. In this paper, we introduce two algorithms that reduce the time complexity of wavelet tree construction by taking advantage of today's ubiquitous multicore machines. Our first algorithm constructs all the levels of the wavelet tree in parallel in O(n) time and \(O(n\lg \sigma + \sigma \lg n)\) bits of working space, where n is the size of the input sequence and \(\sigma \) is the size of the alphabet. Our second algorithm constructs the wavelet tree in a domain decomposition fashion, using our first algorithm in each segment, reaching \(O(\lg n)\) time and \(O(n\lg \sigma + p\sigma \lg n/\lg \sigma )\) bits of extra space, where p is the number of available cores. Both algorithms are practical and report good speedup for large real datasets.
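As a rough illustration of why the levels can be built in parallel, the following Python sketch constructs each level of a levelwise (pointerless) wavelet tree independently of the others. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `build_level` and `build_wavelet_tree`, the list-of-bits representation, the counting-based placement, and the use of a thread pool are all chosen here for illustration. Symbols are assumed to be integers in the range 0 to \(\sigma - 1\).

```python
from concurrent.futures import ThreadPoolExecutor


def build_level(seq, level, height):
    """Build the bitmap of one level, reading only the input sequence."""
    shift = height - level           # bits below the node prefix at this level
    num_nodes = 1 << level           # nodes at this level (assumed layout)

    # Count how many symbols fall into each node of this level.
    counts = [0] * num_nodes
    for s in seq:
        counts[s >> shift] += 1

    # Exclusive prefix sum: starting offset of each node in the level bitmap.
    offsets = [0] * num_nodes
    for v in range(1, num_nodes):
        offsets[v] = offsets[v - 1] + counts[v - 1]

    # Place the bit that decides left/right for each symbol, keeping the
    # original order of symbols inside every node.
    bits = [0] * len(seq)
    for s in seq:
        node = s >> shift
        bits[offsets[node]] = (s >> (shift - 1)) & 1
        offsets[node] += 1
    return bits


def build_wavelet_tree(seq, sigma):
    """Return one bitmap per level; the levels share no intermediate state."""
    height = max(1, (sigma - 1).bit_length())   # ceil(lg sigma), at least 1
    with ThreadPoolExecutor(max_workers=height) as pool:
        return list(pool.map(lambda l: build_level(seq, l, height),
                             range(height)))


if __name__ == "__main__":
    for level, bitmap in enumerate(build_wavelet_tree([3, 0, 2, 1, 3, 1], 4)):
        print(level, bitmap)
```

Each call to `build_level` performs a linear scan of the input, so with one worker per level the work per worker is linear in n. In CPython the thread pool mainly demonstrates that the levels have no mutual dependencies; real speedup would require a runtime with true multicore parallelism.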
Notes
We use \(\lg x = \log _2 x\).
Notice that the RAM model is a subset of the DYM model where the outdegree of every vertex \(v \in V\) is \({\le }1\).
We also tested a new version of Libcds called Libcds2; however, the former had better running times for the construction of wtrees.
http://pizzachili.dcc.uchile.cl/texts/protein/proteins.gz (April, 2015).
http://pizzachili.dcc.uchile.cl/texts/code/sources.gz (April, 2015).
In order to be less sensitive to outliers, we use the median time instead of other statistics. In our experiments, the pwt algorithm showed a larger deviation with respect to the number of threads than the other algorithms. However, the differences were not statistically significant.
A complete report of running times and everything needed to replicate these results is available at www.inf.udec.cl/~josefuentes/wavelettree.
The Unicode Consortium: http://www.unicode.org/.
The construction times of shun with the src.2GB dataset exceed 1 h. To make the algorithms in the figures comparable, we report the running times for the dataset src.1GB.
The computer tested is a dual-processor \(\hbox {Intel}^{\circledR }\) \(\hbox {Xeon}^{\circledR }\) CPU (E5645) with six cores per processor, for a total of 12 physical cores running at 2.50GHz. Hyperthreading was disabled. The computer runs Linux 3.5.0-17-generic, in 64-bit mode. This machine has per-core L1 and L2 caches of sizes 32KB and 256KB, respectively, and one per-processor shared L3 cache of 12MB, with a 5,958MB (\(\sim \hbox {6GB}\)) DDR3 RAM.
To ensure a constant memory access cost, we use the numactl command with the “interleave \(=\) all” option. The command allocates memory in round-robin fashion across the NUMA nodes.
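The note above on reporting the median running time can be made concrete with a small Python sketch. The function name `median_speedup` and the sample timings are hypothetical and are not taken from the paper's experiments; the sketch only shows how a single outlier leaves a median-based speedup unchanged.

```python
from statistics import median


def median_speedup(sequential_times, parallel_times):
    """Speedup computed from median running times (hypothetical helper)."""
    return median(sequential_times) / median(parallel_times)


if __name__ == "__main__":
    # Made-up timings in seconds; the 30.0 s run is a deliberate outlier
    # that does not change the median of the sequential runs (10.1 s).
    sequential = [10.0, 10.2, 9.9, 30.0, 10.1]
    parallel = [2.0, 2.1, 1.9]
    print(median_speedup(sequential, parallel))
```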
References
Arroyuelo D, Costa VG, González S, Marín M, Oyarzún M (2012) Distributed search based on self-indexed compressed text. Inf Process Manag 48(5):819–827. doi:10.1016/j.ipm.2011.01.008
Bingmann T (2015) malloc_count—tools for runtime memory usage analysis and profiling. http://panthema.net/2013/malloc_count/ (2013). Last accessed: 17 Jan 2015
Blumofe RD, Leiserson CE (1999) Scheduling multithreaded computations by work stealing. J ACM 46(5):720–748. doi:10.1145/324133.324234
Brisaboa NR, Luaces MR, Navarro G, Seco D (2013) Space-efficient representations of rectangle datasets supporting orthogonal range querying. Inf Syst 38(5):635–655. doi:10.1016/j.is.2013.01.005
Burrows M, Wheeler DJ (1994) A block-sorting lossless data compression algorithm. Tech. rep., Digital Equipment Corporation
Claude F (2011) A compressed data structure library. https://github.com/fclaude/libcds. Last accessed: 13 August 2015
Claude F, Navarro G (2009) Practical rank/select queries over arbitrary sequences. In: SPIRE. Springer, Berlin, pp 176–187. doi:10.1007/978-3-540-89097-3_18
Claude F, Navarro G (2012) The wavelet matrix. In: SPIRE, vol 7608. Springer, Berlin, pp 167–179. doi:10.1007/978-3-642-34109-0_18
Claude F, Nicholson PK, Seco D (2011) Space efficient wavelet tree construction. In: SPIRE, vol 7024. Springer, Berlin, pp 185–196
Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms, 3rd edn., chap. Multithreaded algorithms. The MIT Press, Cambridge, pp 772–812
Faro S, Külekci MO (2012) Fast multiple string matching using streaming SIMD extensions technology. In: SPIRE. Springer, Berlin, pp 217–228. doi:10.1007/978-3-642-34109-0_23
Ferragina P, Manzini G (2000) Opportunistic data structures with applications. In: Proceedings of the 41st annual symposium on foundations of computer science, FOCS ’00. IEEE Computer Society, Washington, DC, USA, p 390. http://dl.acm.org/citation.cfm?id=795666.796543
Ferragina P, Manzini G, Mäkinen V, Navarro G (2004) An alphabet-friendly FM-index. In: String processing and information retrieval: 11th international conference, SPIRE 2004, Padova, Italy, 5–8 October 2004. Springer, Berlin, pp 150–160. doi:10.1007/978-3-540-30213-1_23
Ferragina P, Manzini G, Mäkinen V, Navarro G (2007) Compressed representations of sequences and full-text indexes. ACM Trans Algorithms 3(2):20. doi:10.1145/1240233.1240243
Fuentes-Sepúlveda J, Elejalde E, Ferres L, Seco D (2014) Efficient wavelet tree construction and querying for multicore architectures. In: Gudmundsson J, Katajainen J (eds) Experimental algorithms, Lecture Notes in Computer Science, vol 8504. Springer, Berlin, pp 150–161. doi:10.1007/978-3-319-07959-2_13
Gog S (2015) Succinct data structure library 2.0. https://github.com/simongog/sdsl-lite (2012). Last accessed: 17 Jan 2015
González R, Grabowski S, Mäkinen V, Navarro G (2005) Practical implementation of rank and select queries. In: WEA. CTI Press, Greece, pp 27–38. Poster
Grossi R, Gupta A, Vitter JS (2003) High-order entropy-compressed text indexes. In: SODA. Soc. Ind. Appl. Math., Philadelphia, pp 841–850
Helman DR, JáJá J (2001) Prefix computations on symmetric multiprocessors. J Parallel Distrib Comput 61(2):265–278. doi:10.1006/jpdc.2000.1678
Illumina, Inc. (2016) An introduction to next-generation sequencing technology. http://www.illumina.com/content/dam/illumina-marketing/documents/products/illumina_sequencing_introduction.pdf
Ladra S, Pedreira O, Duato J, Brisaboa NR (2012) Exploiting SIMD instructions in current processors to improve classical string algorithms. In: ADBIS. Springer, Berlin, pp 254–267. doi:10.1007/978-3-642-33074-2_19
Makris C (2012) Wavelet trees: a survey. Comput Sci Inf Syst 9(2):585–625
Matsumoto M, Nishimura T (1998) Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans Model Comput Simul 8(1):3–30. doi:10.1145/272991.272995
Navarro G (2012) Wavelet trees for all. In: CPM. Springer, Berlin, pp 2–26. doi:10.1007/978-3-642-31265-6_2
Navarro G, Nekrich Y, Russo LMS (2013) Space-efficient data-analysis queries on grids. Theor Comput Sci 482:60–72. doi:10.1016/j.tcs.2012.11.031
Pantaleoni J, Subtil N (2016) Nvbio library. http://nvlabs.github.io/nvbio/index.html. Accessed 12 April 2016
Raman R, Raman V, Satti SR (2007) Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans Algorithms 3(4):43. doi:10.1145/1290672.1290680
Schnattinger T, Ohlebusch E, Gog S (2012) Bidirectional search in a string with wavelet trees and bidirectional matching statistics. Inf Comput 213:13–22. doi:10.1016/j.ic.2011.03.007. http://www.sciencedirect.com/science/article/pii/S0890540112000235. Special Issue: Combinatorial Pattern Matching (CPM 2010)
Shun J (2015) Parallel wavelet tree construction. In: Proceedings of the IEEE data compression conference, Utah, USA, pp 63–72. doi:10.1109/DCC.2015.7
Singer J (2012) A wavelet tree based fm-index for biological sequences in SeqAn. Master’s thesis, Freie Universität Berlin. http://www.mi.fu-berlin.de/wiki/pub/ABI/FMIndexThesis/FMIndex.pdf
Tischler G (2011) On wavelet tree construction. In: CPM. Springer, Berlin, pp 208–218
Touati SAA, Worms J, Briais S (2013) The Speedup-Test: a statistical methodology for program speedup analysis and computation. Concurr Comput Pract Exp 25(10):1410–1426. doi:10.1002/cpe.2939. https://hal.inria.fr/hal-00764454. Article first published online: 15 Oct 2012
Välimäki N, Mäkinen V (2007) Space-efficient algorithms for document retrieval. In: CPM, LNCS, vol. 4580. Springer, Berlin, pp 205–215. doi:10.1007/978-3-540-73437-6_22
Wetterstrand KA (2016) DNA sequencing costs: data from the NHGRI genome sequencing program (GSP). http://www.genome.gov/sequencingcosts. Accessed 12 April 2016
Acknowledgments
This work was supported in part by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 690941 and the doctoral scholarships of CONICYT Nos. 21120974 and 63130228 (first and second authors, respectively). We also would like to thank Roberto Asín for making his multicore computers, Mastropiero and Günther Frager, available to us.
Additional information
A previous version of this paper appeared in the 13th International Symposium on Experimental Algorithms (SEA 2014) [15].
Cite this article
Fuentes-Sepúlveda, J., Elejalde, E., Ferres, L. et al. Parallel construction of wavelet trees on multicore architectures. Knowl Inf Syst 51, 1043–1066 (2017). https://doi.org/10.1007/s10115-016-1000-6
DOI: https://doi.org/10.1007/s10115-016-1000-6