Floating-Point Computation with Just Enough Accuracy

Dietz, Hank; Dieter, Bill; Fisher, Randy; Chang, Kungyen

doi:10.1007/11758501_34

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3991)

Included in the following conference series:

International Conference on Computational Science

Abstract

Most mathematical formulae are defined in terms of operations on real numbers, but computers can only operate on numeric values with finite precision and range. Using floating-point values as real numbers does not clearly identify the precision with which each value must be represented. Too little precision yields inaccurate results; too much wastes computational resources.

The popularity of multimedia applications has made fast hardware support for low-precision floating-point arithmetic common in Digital Signal Processors (DSPs), SIMD Within A Register (SWAR) instruction set extensions for general purpose processors, and in Graphics Processing Units (GPUs). In this paper, we describe a simple approach by which the speed of these low-precision operations can be speculatively employed to meet user-specified accuracy constraints. Where the native precision(s) yield insufficient accuracy, a simple technique is used to efficiently synthesize enhanced precision using pairs of native values.
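
As an illustration of what "pairs of native values" can look like, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the native type is IEEE 754 binary32, emulated here by rounding Python floats through the standard struct module, and it uses the classic error-free two-sum to add two (hi, lo) pairs. The names f32, two_sum, and pair_add are hypothetical and chosen only for this example.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Error-free addition of two binary32 values.

    Returns (s, e) where s is the rounded binary32 sum and e is the
    rounding error, so that s + e equals a + b exactly.
    """
    s = f32(a + b)
    bb = f32(s - a)                       # the part of b that made it into s
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e


def pair_add(x, y):
    """Add two (hi, lo) pairs and return a renormalized (hi, lo) pair.

    Assumption: each input pair is already normalized, i.e. lo is small
    relative to hi. This is a simple variant that does not track the
    rounding error of the low-order addition.
    """
    s, e = two_sum(x[0], y[0])            # exact sum of the high parts
    e = f32(e + f32(x[1] + y[1]))         # fold in the low parts
    return two_sum(s, e)                  # renormalize into (hi, lo)


# A single binary32 value drops the small term; the pair keeps it.
single = f32(1.0 + 2.0 ** -30)
pair = pair_add((1.0, 0.0), (2.0 ** -30, 0.0))
print(single, pair)
```

The design point is that every operation inside two_sum and pair_add is an ordinary native-precision operation, so the extra accuracy costs only a handful of fast low-precision instructions per addition.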

Author information

Authors and Affiliations

Department of Electrical & Computer Engineering, University of Kentucky,
Hank Dietz, Bill Dieter, Randy Fisher & Kungyen Chang

Editor information

Editors and Affiliations

Advanced Computing and Emerging Technologies Centre, The School of Systems Engineering, University of Reading, RG6 6AY, Reading, United Kingdom
Vassil N. Alexandrov
Department of Mathematics and Computer Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
Geert Dick van Albada
Faculty of Sciences, Section of Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
Peter M. A. Sloot
Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra

Cite this paper

Dietz, H., Dieter, B., Fisher, R., Chang, K. (2006). Floating-Point Computation with Just Enough Accuracy. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science – ICCS 2006. ICCS 2006. Lecture Notes in Computer Science, vol 3991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11758501_34

DOI: https://doi.org/10.1007/11758501_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34379-0
Online ISBN: 978-3-540-34380-6
eBook Packages: Computer Science, Computer Science (R0)
