GPU vs FPGA: A Comparative Analysis for Non-standard Precision
FPGAs and GPUs are increasingly used in a range of high-performance computing applications. When implementing numerical algorithms on either platform, we can choose to represent operands with different levels of accuracy. A trade-off exists between the numerical accuracy of arithmetic operators and the resources needed to implement them. Where algorithmic requirements for numerical stability are captured in a design description, this trade-off can be exploited to optimize performance by using high-accuracy operators only where they are most required. Support for half and double-double floating-point representations allows additional flexibility to achieve this. The aim of this work is to study the language and hardware support, and the achievable peak performance, for non-standard precisions on a GPU and an FPGA. A compute-intensive program, matrix-matrix multiply, is selected as a benchmark and implemented for various matrix sizes. The results show that for large enough matrices, GPUs outperform FPGA-based implementations, but for some smaller matrix sizes, specialized FPGA floating-point operators for half and double-double precision can deliver higher throughput than a GPU implementation.
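Double-double precision, mentioned above, represents one value as an unevaluated sum of two doubles, roughly doubling the significand width. As a minimal sketch (not the paper's FPGA or GPU implementation, and the function names are illustrative), the core building block is Knuth's error-free two-sum transformation, from which a double-double addition can be assembled:

```python
def two_sum(a, b):
    """Error-free transformation: return (s, e) with s = fl(a + b)
    and s + e == a + b exactly (a, b are IEEE doubles)."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def dd_add(x, y):
    """Add two double-double numbers, each a (hi, lo) pair with
    hi + lo the represented value and |lo| << |hi|."""
    s, e = two_sum(x[0], y[0])   # exact sum of the high parts
    e += x[1] + y[1]             # fold in the low parts
    return two_sum(s, e)         # renormalize back to (hi, lo)
```

For example, `dd_add((1.0, 0.0), (1e-17, 0.0))` keeps the tiny addend in the low word, whereas plain double addition `1.0 + 1e-17` rounds it away entirely; this extra accuracy is what the double-double operators trade resources for on both platforms.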
Keywords: GPU, FPGA, High Performance Computing (HPC), Non-standard Precision, Half Precision, Double-double Precision