BDAS 2017: Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation, pp. 235-245
Sorting Data on Ultra-Large Scale with RADULS
Abstract
The paper introduces RADULS, a new parallel sorter based on the radix sort algorithm and intended to sort ultra-large data sets efficiently. For example, 4 G 16-byte records can be sorted with 16 threads in less than 15 s on an Intel Xeon-based workstation. To achieve this performance, the implementation of RADULS is highly optimized and parallelized in a cache-friendly manner, so that it makes the most of modern multicore architectures. In addition, our parallel scheduler chooses among several different procedures at runtime, according to the current parameters of the execution, to balance the workload properly. In all our experiments, RADULS outperforms the competing algorithms.
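For readers unfamiliar with the underlying technique, the following is a minimal single-threaded sketch of a byte-wise radix sort over fixed-length records. It is not the RADULS implementation: the abstract does not say which radix sort variant is used, so the least-significant-digit (LSD) variant is assumed here only because it is short, and the function name `lsd_radix_sort` and its parameters are hypothetical. Each pass builds a histogram of one key byte, turns it into start offsets with a prefix sum, and scatters the records stably into a new buffer.

```python
from typing import List


def lsd_radix_sort(records: List[bytes], key_len: int) -> List[bytes]:
    """Sort fixed-length records by their first key_len bytes.

    Illustrative LSD radix sort with radix 256 (one byte per pass).
    Assumes every record has at least key_len bytes.
    """
    src = list(records)
    # Process key bytes from the least significant (last) to the most significant (first).
    for pos in range(key_len - 1, -1, -1):
        # 1. Histogram: count how many records have each byte value at this position.
        counts = [0] * 256
        for rec in src:
            counts[rec[pos]] += 1

        # 2. Exclusive prefix sum: offsets[b] is where the bucket for byte b starts.
        offsets = [0] * 256
        total = 0
        for b in range(256):
            offsets[b] = total
            total += counts[b]

        # 3. Stable scatter: copy records into their buckets in input order.
        dst = [b""] * len(src)
        for rec in src:
            b = rec[pos]
            dst[offsets[b]] = rec
            offsets[b] += 1
        src = dst
    return src
```

Calling `lsd_radix_sort(records, 16)` on a list of 16-byte `bytes` objects returns them in lexicographic order. The histogram and scatter loops are the parts a parallel sorter would typically split across threads, which this sketch deliberately leaves out.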
Keywords
Radix sort, Thread-level parallelization
Notes
Acknowledgments
The work was supported by the Polish National Science Centre under the project DEC-2013/09/B/ST6/03117 (SD, ADG) and by the Silesian University of Technology grant no. BKM507/RAU2/2016 (MK).