Methods for High-Throughput Computation of Elementary Functions

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8384)

Abstract

Computing elementary functions on large arrays is an essential part of many machine learning and signal processing algorithms. Since the introduction of floating-point computations in mainstream processors, table lookups, division, square root, and piecewise approximations have been essential components of elementary function implementations. However, we suggest that these operations cannot deliver high throughput on modern processors, and argue that algorithms that rely only on multiplication, addition, and integer operations would achieve higher performance. We propose four design principles for high-throughput elementary functions and suggest how to apply them to the implementation of the log, exp, sin, and tan functions. We evaluate the performance and accuracy of the new algorithms on three recent x86 microarchitectures and demonstrate that they compare favorably to previously published research and vendor-optimized libraries.
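To illustrate what an evaluation built only from multiplication, addition, and integer operations can look like, the following is a minimal scalar sketch of exp in double precision. It is not the algorithm from the paper: the function name `exp_sketch`, the Taylor coefficients (a minimax polynomial would normally be preferred), the rounding constant, and the restriction to moderate arguments (roughly |x| < 700, with no handling of overflow, underflow, infinities, or NaN) are all assumptions made for illustration.

```python
import math
import struct

# Assumed constants: a common two-part (Cody-Waite style) split of ln(2).
INV_LN2 = 1.4426950408889634         # 1 / ln(2)
LN2_HI = 6.93147180369123816490e-01  # leading bits of ln(2)
LN2_LO = 1.90821492927058770002e-10  # remainder: ln(2) - LN2_HI
ROUND_MAGIC = 6755399441055744.0     # 1.5 * 2**52, used to round by addition

# Taylor coefficients 1/k! for k = 0..13, computed once at import time,
# so no division happens inside exp_sketch itself.
COEFFS = [1.0 / math.factorial(k) for k in range(14)]


def exp_sketch(x):
    """Approximate exp(x) for moderate x (hypothetical helper, roughly |x| < 700)."""
    # Round x / ln(2) to the nearest integer using one multiply and two adds.
    n_float = (x * INV_LN2 + ROUND_MAGIC) - ROUND_MAGIC
    n = int(n_float)

    # Two-step range reduction: r = x - n * ln(2), with |r| <= ln(2) / 2.
    r = (x - n_float * LN2_HI) - n_float * LN2_LO

    # Horner evaluation of the polynomial: multiplications and additions only.
    p = COEFFS[-1]
    for c in reversed(COEFFS[:-1]):
        p = p * r + c

    # Build 2**n with integer operations by writing the biased exponent
    # into bits 52..62 of an IEEE 754 double (valid for -1022 <= n <= 1023).
    bits = (n + 1023) << 52
    scale = struct.unpack("<d", struct.pack("<q", bits))[0]

    return p * scale
```

The function contains no table lookup, division, square root, or data-dependent branch, so the same sequence of operations could in principle be applied to every lane of a SIMD register, with each multiply-add pair in the Horner loop mapping to a fused multiply-add where the hardware provides one.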

Keywords

Elementary functions, SIMD, Fused multiply-add

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology, Atlanta, USA
