
Abstract

We propose an OpenMP programming environment for fine-grained approximate computing on a multi-core cluster architecture with shared, accuracy-reconfigurable floating-point units (FPUs). This shared-FPU cluster dynamically characterizes floating-point (FP) pipeline vulnerability (FPV) and exposes it as metadata to a software scheduler to reduce the cost of error correction. To further reduce this cost, our programming and runtime environment also supports controlled approximate computation through a combination of design-time and runtime techniques. We provide OpenMP extensions (as custom directives) for FP computations to specify the parts of a program that can be executed approximately. We use a profiling technique to identify tolerable error significance and error rate thresholds in error-tolerant image processing applications. This information then guides an application-driven hardware FPU synthesis and optimization design flow to generate efficient FPUs. At runtime, the scheduler uses the FPV metadata to promote FPUs to accurate mode or demote them to approximate mode, depending on the requirements of the code region. We demonstrate the effectiveness of our approach, in terms of energy savings, on a 16-core tightly coupled cluster with eight shared FPUs for both error-tolerant and general-purpose error-intolerant applications. This chapter provides a method for accepting errors in tightly coupled processor clusters with shared FPUs.
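The runtime behavior described in the abstract can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the chapter's implementation: the names `FPU`, `dispatch`, `fp_mul`, and `approx_float32` are hypothetical, FPV is modeled as a single number per FPU, and approximate mode is emulated by zeroing low-order mantissa bits of a single-precision result, which is only one possible stand-in for reduced accuracy.

```python
import struct
from dataclasses import dataclass


def approx_float32(x: float, dropped_bits: int) -> float:
    """Emulate an approximate result: zero the lowest mantissa bits of a float32.

    Assumption: dropped_bits is between 0 and 23 (the float32 fraction width).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = (0xFFFFFFFF << dropped_bits) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


@dataclass
class FPU:
    fpu_id: int
    fpv: float               # hypothetical FPV metadata, e.g. an observed error rate
    mode: str = "accurate"


def dispatch(fpus, approximate: bool) -> FPU:
    """Pick the least vulnerable FPU and set its mode for the code region."""
    fpu = min(fpus, key=lambda f: f.fpv)
    fpu.mode = "approximate" if approximate else "accurate"
    return fpu


def fp_mul(fpu: FPU, a: float, b: float, dropped_bits: int = 12) -> float:
    """Multiply on the given FPU; approximate mode drops low mantissa bits."""
    product = a * b
    if fpu.mode == "approximate":
        return approx_float32(product, dropped_bits)
    return product
```

In this model, a code region marked as approximate would call `dispatch(fpus, approximate=True)` before issuing FP operations, and an accurate region would call it with `approximate=False`. With 12 dropped bits, the relative error of the emulated result stays below 2^-11, which loosely mirrors the idea of an error significance threshold.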



Author information

Corresponding author

Correspondence to Abbas Rahimi.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Rahimi, A., Benini, L., Gupta, R.K. (2017). Accuracy-Configurable OpenMP. In: From Variability Tolerance to Approximate Computing in Parallel Integrated Architectures and Accelerators. Springer, Cham. https://doi.org/10.1007/978-3-319-53768-9_9

  • DOI: https://doi.org/10.1007/978-3-319-53768-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-53767-2

  • Online ISBN: 978-3-319-53768-9

  • eBook Packages: Engineering, Engineering (R0)
