Implementation of multi-precision floating point divider for high speed signal processing applications


The introduction of the fused multiply-add technique has improved the design metrics of floating point (FP) arithmetic operations. In particular, the Newton–Raphson (NR) based division algorithm is a popular alternative to SRT division algorithms. This paper proposes a 32-bit FP divider that uses the NR division technique with pipelining and is capable of high speed signal processing operations. The proposed divider can be used as an IP core to speed up FP operations. It improves rounding accuracy while reducing area overhead. The NR calculations are implemented by iteratively using a 32-bit FP multiplier and adder. The key module for calculating the significand (mantissa) is a 24-bit Wallace tree multiplier built from carry save adders. The Wallace multiplier provides higher computational speed and is therefore used as part of the FP divider. The proposed pipelined FP divider is a fully combinational circuit, designed for signal processing applications, with clock and data gating applied to reduce dynamic power consumption, delay and area overhead. The operands are represented and operated on according to the IEEE 754 standard. The operation and results are validated through simulation in Vivado and implemented on a Xilinx 7-series Artix field programmable gate array.
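
The NR division scheme summarized above can be illustrated in software. The following is a minimal sketch under stated assumptions, not the authors' hardware design: the reciprocal of the divisor significand d is refined with the standard recurrence x_{i+1} = x_i (2 - d x_i), which needs only multiplications and one subtraction per step. The linear starting value 48/17 - (32/17) d, the choice of three iterations, and the names `unpack_f32`, `to_f32` and `nr_divide` are assumptions introduced for illustration; the abstract does not specify them. Python double precision arithmetic stands in for the 32-bit FP multiplier and adder, and only normal, nonzero operands are handled.

```python
import struct


def unpack_f32(x):
    """Split x (rounded to IEEE 754 binary32) into sign, biased exponent, fraction."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def to_f32(x):
    """Round a Python float to the nearest binary32 value (raises OverflowError if too large)."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def nr_divide(a, b, iterations=3):
    """Approximate a / b with a Newton-Raphson reciprocal (hypothetical helper)."""
    sa, ea, fa = unpack_f32(a)
    sb, eb, fb = unpack_f32(b)
    if not (0 < ea < 255 and 0 < eb < 255):
        raise ValueError("sketch handles normal, nonzero operands only")

    # 24-bit significands with the hidden leading one, as values in [1, 2).
    ma = 1 + fa / 2**23
    mb = 1 + fb / 2**23

    # Scale the divisor significand into [0.5, 1) for the starting estimate.
    d = mb / 2

    # Assumed linear starting value; its relative error is at most 1/17.
    x = 48 / 17 - (32 / 17) * d

    # Each step roughly squares the error: two multiplies and one subtract.
    for _ in range(iterations):
        x = x * (2 - d * x)

    # ma / mb = ma * (1 / d) / 2; exponents subtract, signs combine by XOR.
    quotient = ma * x / 2
    sign = -1.0 if sa ^ sb else 1.0
    # Final rounding to binary32; not guaranteed to be correctly rounded.
    return to_f32(sign * quotient * 2.0 ** (ea - eb))


if __name__ == "__main__":
    print(nr_divide(1.0, 3.0))
    print(nr_divide(7.5, 2.5))
```

With the assumed starting value, the error after three iterations is bounded by (1/17)^8, which is below the 2^-24 resolution of a 24-bit significand. A hardware divider would additionally need handling for zeros, infinities, NaNs, subnormals and the IEEE 754 rounding modes, all of which this sketch omits.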


Author information

Correspondence to C. R. S. Hanuman.

Cite this article

Hanuman, C.R.S., Kamala, J. & Aruna, A.R. Implementation of multi-precision floating point divider for high speed signal processing applications. J Supercomput 75, 6038–6054 (2019).

Keywords


  • Floating point arithmetic
  • Wallace tree
  • Newton–Raphson
  • Signal processing