Specialization and Fusion of Floating-Point Operators

Application-Specific Arithmetic

Abstract

This chapter studies the optimizations that arise in the specialization or fusion of floating-point operators. It builds upon the specialized fixed-point operators of previous chapters and focuses on the exponent management and rounding issues specific to floating point.

Notes

  1. Strictly speaking, as the exponent is constant, the point does not actually float here.

  2. Of course, trusting the compiler to respect language standards is necessary but not sufficient to ensure safe applications.

Copyright information

© 2024 Springer Nature Switzerland AG

About this chapter

Cite this chapter

de Dinechin, F., Kumm, M. (2024). Specialization and Fusion of Floating-Point Operators. In: Application-Specific Arithmetic. Springer, Cham. https://doi.org/10.1007/978-3-031-42808-1_15

  • DOI: https://doi.org/10.1007/978-3-031-42808-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-42807-4

  • Online ISBN: 978-3-031-42808-1

  • eBook Packages: Engineering, Engineering (R0)
