Double-precision floating-point arithmetic, and multiplication in particular, is widely used in scientific and signal processing applications. A double-precision floating-point multiplier generally requires a large 53×53 mantissa multiplication to produce the final result, and this mantissa multiplication limits both the area and the performance of the operation. This paper presents a novel way to reduce this large multiplication. The proposed approach requires less multiplication hardware than the traditional method, with the multiplication carried out using the Karatsuba technique. The design specifically targets Field Programmable Gate Array (FPGA) platforms and has also been evaluated on an ASIC flow. The proposed module delivers high performance with efficient use of resources, and the design is fully compatible with the IEEE standard precision. The proposed module performs better than the best multipliers reported in the literature.
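To illustrate the Karatsuba idea mentioned in the abstract, here is a minimal software sketch in Python. It is not the paper's hardware architecture: the single-level split of a 54-bit padded operand into two 27-bit halves, and the names `significand53` and `karatsuba_mul`, are assumptions made for this example only.

```python
import struct


def significand53(x: float) -> int:
    """Return the 53-bit significand of a normal double (hidden bit included).

    Illustrative helper; subnormals are returned without a hidden bit.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    fraction = bits & ((1 << 52) - 1)      # 52 stored fraction bits
    exponent = (bits >> 52) & 0x7FF        # 11-bit biased exponent
    if exponent != 0:                      # normal number: restore hidden 1
        fraction |= 1 << 52
    return fraction


def karatsuba_mul(a: int, b: int, width: int = 54) -> int:
    """One-level Karatsuba product of two unsigned integers of `width` bits.

    Assumption: operands are split into two equal halves of width // 2 bits.
    Three half-width multiplications replace the four of the schoolbook method.
    """
    half = width // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask

    low = a_lo * b_lo                      # multiplication 1
    high = a_hi * b_hi                     # multiplication 2
    # multiplication 3: (a_hi + a_lo)(b_hi + b_lo) - high - low
    # equals the cross term a_hi*b_lo + a_lo*b_hi
    cross = (a_hi + a_lo) * (b_hi + b_lo) - high - low

    return (high << (2 * half)) + (cross << half) + low


if __name__ == "__main__":
    ma = significand53(3.141592653589793)
    mb = significand53(2.718281828459045)
    product = karatsuba_mul(ma, mb)
    print(product == ma * mb)              # compare with the direct product
```

The saving comes from the cross term: it is obtained from one extra multiplication plus additions and subtractions, rather than from two separate multiplications. In hardware, that trade of multipliers for adders is what reduces the multiplier resources needed.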
The authors would like to thank the anonymous reviewers, whose comments and suggestions helped considerably to improve the paper. This work is supported by the City University of Hong Kong (Project No. 7200179).
Cite this article
Jaiswal, M.K., Cheung, R.C.C. VLSI Implementation of Double-Precision Floating-Point Multiplier Using Karatsuba Technique. Circuits Syst Signal Process 32, 15–27 (2013). https://doi.org/10.1007/s00034-012-9457-3