A Block-Based Systolic Array on an HBM2 FPGA for DNA Sequence Alignment
Finding the optimal local similarity between a pair of genomic sequences is one of the most fundamental problems in bioinformatics, and the Smith-Waterman algorithm was developed for that specific purpose. With continuous advances in computing, the method has become widely used, and its reach has expanded to a broad range of applications, including network packet inspection and pattern matching. The algorithm is based on dynamic programming and is guaranteed to find the optimal local alignment between two sequences. Its computational complexity is O(mn), where m and n are the numbers of elements in the query sequence and the database sequence, respectively. Researchers have investigated several ways to accelerate the calculation using CPUs, GPUs, the Cell B.E., and FPGAs. Most of them have proposed a data-reuse approach because the Smith-Waterman algorithm has a rather high “bytes per operation” ratio; in other words, it requires large memory bandwidth. In this paper, we try to minimize the impact of the memory bandwidth bottleneck by implementing a block-based systolic array that maximizes the usage of the memory banks in HBM2 (High Bandwidth Memory). The proposed approach demonstrates higher performance in terms of GCUPS (Giga Cell Updates Per Second) than one of the best cases reported in previous work, and also achieves a significant improvement in power efficiency. For example, our implementation reached 429.39 GCUPS at a power efficiency of 7.68 GCUPS/W. With a different configuration, it reached 316.73 GCUPS at a peak power efficiency of 8.86 GCUPS/W.
Keywords: DNA sequence alignment, Smith-Waterman algorithm, Systolic array, HBM2, High Level Synthesis, Reconfigurable High Performance Computing
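To make the O(mn) cost and the GCUPS metric from the abstract concrete, the following is a minimal software sketch of the Smith-Waterman score computation. It is not the block-based systolic array design described in the paper. The linear gap model, the scoring values (`match`, `mismatch`, `gap`), and the function names `smith_waterman_score` and `gcups` are assumptions chosen for illustration only.

```python
def smith_waterman_score(query, database, match=2, mismatch=-1, gap=-1):
    """Return (best local alignment score, number of cells updated).

    Assumes a linear gap penalty; the scoring values are illustrative
    defaults, not the parameters used in the paper.
    """
    m, n = len(query), len(database)
    prev = [0] * (n + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, m + 1):
        curr = [0] * (n + 1)  # current row; column 0 stays 0
        for j in range(1, n + 1):
            sub = match if query[i - 1] == database[j - 1] else mismatch
            # Each cell depends on its diagonal, upper, and left neighbours;
            # the 0 term is what makes the alignment local.
            curr[j] = max(0,
                          prev[j - 1] + sub,
                          prev[j] + gap,
                          curr[j - 1] + gap)
            if curr[j] > best:
                best = curr[j]
        prev = curr
    # Every one of the m * n cells is updated exactly once: O(mn) work.
    return best, m * n


def gcups(cells, seconds):
    """Giga Cell Updates Per Second for a run that updated `cells` cells."""
    return cells / seconds / 1e9


if __name__ == "__main__":
    score, cells = smith_waterman_score("GTTAC", "GTTGAC",
                                        match=3, mismatch=-3, gap=-2)
    print(score, cells)
```

Only two rows of the matrix are kept here, which is enough for the score but not for traceback. Because each cell needs only its three neighbours, all cells on one anti-diagonal are independent of each other, which is the property that systolic array implementations exploit.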
This work was supported in part by MEXT as “Next Generation High-Performance Computing Infrastructures and Applications R&D Program” (Development of Computing-Communication Unified Supercomputer in Next Generation), and by JSPS KAKENHI Grant Numbers JP17H01707 and JP18H03246. The authors would also like to thank Xilinx Inc. for providing FPGA software tools through the Xilinx University Program.