Abstract
We propose an OpenMP programming environment for fine-grained approximate computing on multi-core cluster architectures with shared, accuracy-reconfigurable floating-point units (FPUs). The shared-FPU cluster dynamically characterizes FP pipeline vulnerability (FPV) and exposes it as metadata to a software scheduler, reducing the cost of error correction. To reduce this cost further, our programming and runtime environment also supports controlled approximate computation through a combination of design-time and runtime techniques. We provide OpenMP extensions (as custom directives) for FP computations that specify which parts of a program can be executed approximately. We use a profiling technique to identify tolerable error-significance and error-rate thresholds in error-tolerant image processing applications. This information then guides an application-driven hardware FPU synthesis and optimization design flow that generates efficient FPUs. At runtime, the scheduler uses the FPV metadata to promote FPUs to accurate mode or demote them to approximate mode, depending on the requirements of the code region. We demonstrate the effectiveness of our approach, in terms of energy savings, on a 16-core tightly coupled cluster with eight shared FPUs, for both error-tolerant and general-purpose error-intolerant applications. This chapter provides a method for accepting errors in tightly coupled processor clusters with shared FPUs.
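The runtime decision described above can be illustrated with a minimal C sketch. Everything in it is an assumption made for illustration: the type and function names, the representation of FPV as a value in [0, 1], and the selection policy (lowest-FPV idle unit for accurate regions, highest-FPV idle unit for approximate regions) are hypothetical and are not taken from the chapter. The directive spelling in the comment is likewise only a placeholder for the custom OpenMP extensions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a shared, accuracy-reconfigurable FPU.
 * All names here are illustrative assumptions, not the chapter's API. */
enum fpu_mode   { FPU_ACCURATE, FPU_APPROXIMATE };
enum region_req { REGION_ACCURATE, REGION_APPROXIMATE };

struct fpu {
    enum fpu_mode mode; /* current accuracy mode of the unit */
    double fpv;         /* assumed FPV metadata in [0, 1]; higher = more error-prone */
    int busy;           /* nonzero if the unit is already allocated */
};

/*
 * A code region might be annotated in source roughly like this
 * (placeholder spelling, shown as a comment so the file still compiles):
 *
 *     #pragma omp approximate
 *     { ... error-tolerant FP kernel ... }
 *
 * Assumed policy: an accurate region takes the idle FPU with the lowest
 * FPV (fewer expected corrections) and promotes it to accurate mode; an
 * approximate region takes the idle FPU with the highest FPV and demotes
 * it to approximate mode, leaving the more reliable units free.
 * Returns the index of the chosen FPU, or -1 if every unit is busy.
 */
int schedule_fpu(struct fpu *fpus, size_t n, enum region_req req)
{
    assert(fpus != NULL && n > 0);

    int best = -1;
    for (size_t i = 0; i < n; ++i) {
        if (fpus[i].busy)
            continue;
        if (best < 0) {
            best = (int)i;
            continue;
        }
        int better = (req == REGION_ACCURATE)
                         ? (fpus[i].fpv < fpus[best].fpv)
                         : (fpus[i].fpv > fpus[best].fpv);
        if (better)
            best = (int)i;
    }

    if (best >= 0) {
        /* Promote or demote the chosen unit to match the region. */
        fpus[best].mode = (req == REGION_ACCURATE) ? FPU_ACCURATE
                                                   : FPU_APPROXIMATE;
        fpus[best].busy = 1;
    }
    return best;
}
```

In this sketch the caller would invoke `schedule_fpu` on entry to each annotated region and clear `busy` on exit; how the actual runtime tracks allocation and measures FPV is not specified in the abstract.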
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Rahimi, A., Benini, L., Gupta, R.K. (2017). Accuracy-Configurable OpenMP. In: From Variability Tolerance to Approximate Computing in Parallel Integrated Architectures and Accelerators. Springer, Cham. https://doi.org/10.1007/978-3-319-53768-9_9
DOI: https://doi.org/10.1007/978-3-319-53768-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53767-2
Online ISBN: 978-3-319-53768-9
eBook Packages: Engineering, Engineering (R0)