
VLSI Implementation of Double-Precision Floating-Point Multiplier Using Karatsuba Technique


Double-precision floating-point arithmetic, and multiplication in particular, is widely used in scientific and signal processing applications. A double-precision floating-point multiplier generally requires a large 53×53-bit mantissa multiplication to produce the final result, and this mantissa multiplication limits both the area and the performance of the operation. This paper presents a novel way to reduce this large multiplication: the mantissa product is computed using the Karatsuba technique, which requires less multiplication hardware than the traditional method. The design specifically targets Field Programmable Gate Array (FPGA) platforms and has also been evaluated in an ASIC flow. The proposed module uses resources efficiently, is fully compatible with IEEE standard double precision, and performs better than the best multipliers reported in the literature.
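As a rough illustration of why the Karatsuba technique saves multiplier hardware, the following Python sketch applies one level of Karatsuba splitting to two 53-bit mantissas. It is not the architecture proposed in the paper: the function names `mantissa53` and `karatsuba_once`, the 27-bit split point, and the single recursion level are all assumptions made for this example.

```python
import struct


def mantissa53(x: float) -> int:
    """Return the 53-bit significand of a double (hypothetical helper).

    The hidden leading 1 is restored for normalized numbers; subnormals
    and zero keep only their 52 stored fraction bits.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    fraction = bits & ((1 << 52) - 1)
    exponent = (bits >> 52) & 0x7FF
    return fraction | (1 << 52) if exponent != 0 else fraction


def karatsuba_once(a: int, b: int, half: int = 27) -> int:
    """Multiply two non-negative integers with one Karatsuba split.

    The split point `half` is an assumption of this sketch. Each operand
    is cut into a high and a low part, and the full product is rebuilt
    from three smaller products instead of the four a schoolbook split
    would need.
    """
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask

    high = a_hi * b_hi                      # product 1
    low = a_lo * b_lo                       # product 2
    # product 3 yields the cross terms a_hi*b_lo + a_lo*b_hi
    middle = (a_hi + a_lo) * (b_hi + b_lo) - high - low

    return (high << (2 * half)) + (middle << half) + low


if __name__ == "__main__":
    ma, mb = mantissa53(1.5), mantissa53(1.25)
    print(karatsuba_once(ma, mb) == ma * mb)
```

The trade-off in this sketch is that one multiplication is exchanged for a few extra additions and subtractions, which are typically cheaper than a multiplier in hardware.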






The authors would like to thank the anonymous reviewers, whose comments and suggestions helped considerably to improve the paper. This work is supported by the City University of Hong Kong (Project No. 7200179).

Author information


Corresponding author

Correspondence to Manish Kumar Jaiswal.


About this article

Cite this article

Jaiswal, M.K., Cheung, R.C.C. VLSI Implementation of Double-Precision Floating-Point Multiplier Using Karatsuba Technique. Circuits Syst Signal Process 32, 15–27 (2013).
