Specialization and Fusion of Floating-Point Operators

de Dinechin, Florent; Kumm, Martin

doi:10.1007/978-3-031-42808-1_15

Abstract

This chapter studies the optimizations that become possible when floating-point operators are specialized or fused. It builds upon the specialized fixed-point operators of previous chapters and focuses on the exponent management and rounding issues specific to floating point.

Notes

1. Strictly speaking, as the exponent is constant, the point does not actually float here.
2. Of course, trusting the compiler to respect language standards is necessary but not sufficient to ensure safe applications.

Author information

Authors and Affiliations

CITI laboratory, INSA-Lyon, Villeurbanne, France
Florent de Dinechin
Fulda University of Applied Sciences, Fulda, Germany
Martin Kumm

About this chapter

Cite this chapter

de Dinechin, F., Kumm, M. (2024). Specialization and Fusion of Floating-Point Operators. In: Application-Specific Arithmetic. Springer, Cham. https://doi.org/10.1007/978-3-031-42808-1_15

DOI: https://doi.org/10.1007/978-3-031-42808-1_15
Published: 23 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42807-4
Online ISBN: 978-3-031-42808-1
eBook Packages: Engineering, Engineering (R0)
