Abstract
Extracting valuable information from rapidly growing Big Data faces serious performance constraints, especially in software-based database management systems (DBMS). In a query processing system, hash-based computational primitives such as the hash join and the group-by are the most time-consuming operations, because they frequently access the hash table in high-latency off-chip memory and may have to traverse the whole table. Moreover, hash collisions are inherent to hash tables and can further degrade overall performance.
To alleviate this problem, we present a novel, purely hardware-based hash engine implemented on an FPGA. To mitigate the high memory access latencies and to resolve hash collisions faster, we follow a novel design point: caching hash table entries in the fast on-chip Block RAMs of the FPGA. Serving the corresponding hash table entries from this cache rather than from off-chip memory improves overall performance.
We evaluate the proposed approach by running the hash-based table join and group-by operations of five TPC-H benchmark queries. The results show speedups of 2.9×–4.4× over a cache-less FPGA-based baseline.
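For readers less familiar with these primitives, the following is a minimal software sketch of a hash join and a hash-based group-by. It is illustrative only and is not the hardware engine described in this paper; the function names `hash_join` and `group_by_sum` and the row-as-dictionary layout are assumptions made for the example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows (hypothetical helper)."""
    # Build phase: insert every row of the build relation into a hash table.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: one hash table lookup per probe row. In a hardware
    # engine, each lookup may turn into an off-chip memory access.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined


def group_by_sum(rows, key, value):
    """Sum `value` per distinct `key` using a hash table (hypothetical helper)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]  # one hash table update per input row
    return dict(totals)
```

Both operations touch the hash table once per input row, which is why the latency of each access dominates their running time.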
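The caching idea can be illustrated with a small software model. This is a sketch under stated assumptions, not the authors' design: the class `CachedHashTable`, the direct-mapped cache organization, the chained collision handling, and the counter names are all hypothetical, and the model assumes the table is read-only once built.

```python
class CachedHashTable:
    """Chained hash table in 'slow' memory fronted by a small direct-mapped cache.

    Hypothetical software model: `slow_reads` counts simulated off-chip
    accesses; the cache list stands in for on-chip Block RAM.
    """

    def __init__(self, num_buckets=1024, cache_lines=64):
        self.buckets = [[] for _ in range(num_buckets)]  # off-chip table
        self.cache = [None] * cache_lines  # each line holds (key, value) or None
        self.slow_reads = 0
        self.cache_hits = 0

    def insert(self, key, value):
        # Colliding keys are appended to the same bucket chain.
        self.buckets[hash(key) % len(self.buckets)].append((key, value))

    def lookup(self, key):
        line = hash(key) % len(self.cache)
        entry = self.cache[line]
        if entry is not None and entry[0] == key:
            self.cache_hits += 1  # served from the fast cache
            return entry[1]
        # Cache miss: walk the collision chain, one slow read per entry.
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            self.slow_reads += 1
            if k == key:
                self.cache[line] = (k, v)  # fill the cache line for next time
                return v
        return None
```

In this model, a repeated lookup of the same key is answered from the cache without walking the collision chain again, so both the memory latency and the collision-resolution cost are avoided on a hit.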
Acknowledgments
The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) for the Advanced Analytics for Extremely Large European Databases (AXLE) project under grant agreement number 318633, and from the Ministry of Economy and Competitiveness of Spain under contract number TIN2015-65316-p.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Salami, B., Arcas-Abella, O., Sonmez, N., Unsal, O., Kestelman, A.C. (2017). Accelerating Hash-Based Query Processing Operations on FPGAs by a Hash Table Caching Technique. In: Barrios Hernández, C., Gitler, I., Klapp, J. (eds) High Performance Computing. CARLA 2016. Communications in Computer and Information Science, vol 697. Springer, Cham. https://doi.org/10.1007/978-3-319-57972-6_10
DOI: https://doi.org/10.1007/978-3-319-57972-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57971-9
Online ISBN: 978-3-319-57972-6
eBook Packages: Computer Science (R0)