The introduction of the fused multiply-add technique has improved the design metrics of floating point (FP) arithmetic operations. In particular, the Newton–Raphson (NR) based division algorithm is a popular alternative to SRT division algorithms. This paper proposes a 32-bit FP divider that uses the NR division technique with pipelining and is capable of high-speed signal processing operations. The proposed divider can be used as an IP core, making it a suitable choice for speeding up FP operations. It improves rounding accuracy and reduces area overhead. The NR calculations are implemented by iteratively using a 32-bit FP multiplier and adder. The key module for computing the significand (mantissa) is a 24-bit Wallace tree multiplier built from carry-save adders. The Wallace multiplier provides higher computational speed and is therefore used effectively as part of the FP divider. The proposed pipelined FP divider, designed for signal processing applications, is a fully combinational circuit with clock and data gating applied to reduce dynamic power consumption, delay and area overhead. It also improves the accuracy of the result. The operands are represented and processed according to the IEEE 754 standard. The operation and results are validated through simulation in Vivado and implemented on a Xilinx 7-series Artix field programmable gate array.
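As a rough illustration of the NR division idea summarized above, the following Python sketch computes a quotient using only multiplications and subtractions. It is not the authors' hardware design: the function names `nr_reciprocal` and `nr_divide`, the linear initial estimate, and the fixed count of three iterations are assumptions chosen for illustration, and the arithmetic runs in Python double precision rather than in a 32-bit pipelined datapath.

```python
import math


def nr_reciprocal(d, iterations=3):
    """Approximate 1/d for d in [0.5, 1) with Newton-Raphson iterations.

    Hypothetical helper for illustration; the iteration count and the
    initial estimate are assumptions, not values taken from the paper.
    """
    if not 0.5 <= d < 1.0:
        raise ValueError("d must be scaled into [0.5, 1)")
    # Linear initial estimate; on this interval |1 - d*x| is at most 1/17.
    x = 48.0 / 17.0 - (32.0 / 17.0) * d
    for _ in range(iterations):
        # One NR step: two multiplications and one subtraction.
        # The error term (1 - d*x) is squared by each step.
        x = x * (2.0 - d * x)
    return x


def nr_divide(a, b, iterations=3):
    """Compute a / b as a * (1 / b) using the NR reciprocal above."""
    if b == 0.0:
        raise ZeroDivisionError("division by zero")
    # Split |b| into a significand in [0.5, 1) and a power-of-two exponent.
    mant, exp = math.frexp(abs(b))
    # Multiply by the reciprocal of the significand, then undo the scaling.
    q = math.ldexp(abs(a) * nr_reciprocal(mant, iterations), -exp)
    # The sign of the quotient is the XOR of the operand signs.
    return -q if (a < 0) != (b < 0) else q
```

With this starting estimate, three iterations bring the relative error of the reciprocal below the resolution of a 24-bit significand, which is why so few passes through a multiplier and adder can suffice for single precision. A hardware divider would additionally need IEEE 754 rounding and special-case handling (zeros, infinities, NaNs, subnormals), which this sketch omits.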
Cite this article
Hanuman, C.R.S., Kamala, J. & Aruna, A.R. Implementation of multi-precision floating point divider for high speed signal processing applications. J Supercomput 75, 6038–6054 (2019). https://doi.org/10.1007/s11227-019-02902-w
- Floating point arithmetic
- Wallace tree
- Signal processing