International Journal of Parallel Programming, Volume 43, Issue 6, pp 1218–1243

Extending Summation Precision for Network Reduction Operations

  • George Michelogiannakis
  • Xiaoye S. Li
  • David H. Bailey
  • John Shalf
Article

DOI: 10.1007/s10766-014-0326-5

Cite this article as:
Michelogiannakis, G., Li, X.S., Bailey, D.H. et al. Int J Parallel Prog (2015) 43: 1218. doi:10.1007/s10766-014-0326-5

Abstract

Double precision summation is at the core of numerous important algorithms such as Newton–Krylov methods and other operations involving inner products, such as matrix multiplication and dot products. However, the effectiveness of summation is limited by the accumulation of rounding errors due to compressed representations. This is an increasing problem as modern HPC systems and data sets scale, because they can easily perform summations with millions or billions of operands. To reduce the impact of precision loss, researchers have proposed increased- and arbitrary-precision libraries that provide reproducible error or even bounded error accumulation for large sums. However, such libraries increase computation and communication time significantly, and do not always guarantee an exact result. In this article, we propose fixed-point representations of double precision variables that enable arbitrarily large summations without error and provide exact and reproducible results. We call this format big integer (BigInt). Even though such formats have been studied for local processor computations, we make the case that using fixed-point representation for distributed computation over a system-wide network is feasible, with performance comparable to that of double-precision floating point summation. This is made possible by adding simple and inexpensive logic to modern NICs, or by using the programmable logic already found in many of them, which accelerates performance on large-scale systems and avoids waking up processors.
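The idea of summing doubles exactly through a fixed-point representation can be illustrated with a minimal Python sketch. It is not the BigInt format or the NIC logic of the article: the scale of 2^1074 bits, the function names, and the use of Python's arbitrary-precision integers are all assumptions made for illustration. The sketch relies only on the fact that every finite IEEE 754 double is an integer multiple of 2^-1074, so integer addition of the scaled values is exact and independent of operand order.

```python
# Illustrative sketch only (assumed names and scale, not the article's format):
# exact, order-independent summation of doubles via fixed-point big integers.

SCALE_BITS = 1074          # every finite double is an integer multiple of 2**-1074
SCALE = 1 << SCALE_BITS


def to_fixed(x: float) -> int:
    """Convert a finite double to a scaled integer without any rounding."""
    # as_integer_ratio() is exact; it raises for infinities and NaN.
    n, d = x.as_integer_ratio()   # x == n / d, with d a power of two <= 2**1074
    return n * (SCALE // d)


def from_fixed(v: int) -> float:
    """Round the exact fixed-point value to the nearest double, once."""
    return v / SCALE


def exact_sum(values):
    """Sum doubles exactly; the only rounding is the final conversion."""
    return from_fixed(sum(to_fixed(x) for x in values))


def reduce_partials(chunks):
    """Mimic a reduction: each chunk (a hypothetical node) forms an integer
    partial sum, and the partial sums are then combined exactly."""
    partials = [sum(to_fixed(x) for x in chunk) for chunk in chunks]
    return from_fixed(sum(partials))
```

With the operands `[1e100, 1.0, -1e100]`, ordinary left-to-right floating-point addition loses the `1.0` entirely, whereas `exact_sum` recovers it regardless of operand order or of how the operands are split across chunks. A hardware format would use a fixed bit width with headroom for carries rather than unbounded integers; that sizing is outside the scope of this sketch.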

Keywords

  • Computation precision
  • Double-precision
  • Distributed summation

Copyright information

© Springer Science+Business Media New York (outside the USA) 2014

Authors and Affiliations

  • George Michelogiannakis (1)
  • Xiaoye S. Li (1)
  • David H. Bailey (1)
  • John Shalf (1)
  1. Lawrence Berkeley National Laboratory, Berkeley, USA