Toward a software transactional memory for heterogeneous CPU–GPU processors

Abstract

Heterogeneous accelerated processing units (APUs) integrate a multi-core CPU and a GPU on the same chip. Modern APUs implement CPU–GPU platform atomics for simple data types. However, ensuring atomicity for complex data types is left to programmers. Transactional memory (TM) is an optimistic approach to achieving this goal. With TM, multiple threads can access shared data speculatively, but a transaction's changes become visible only if it completes without its memory accesses conflicting with those of other transactions. In this paper we present APUTM, a software TM designed for APU processors that focuses on minimizing accesses to shared metadata. The main goal of APUTM is to understand the trade-offs of implementing a software TM on such a platform. In our experiments, APUTM outperforms sequential execution of the applications. Additionally, we evaluate its adaptability when executing on one of the devices or on both simultaneously.
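For readers new to TM, the optimistic model described above can be sketched in a few lines of Python. This is a generic, CPU-only illustration of speculative execution with commit-time validation, not APUTM's algorithm, which the abstract does not detail. All names (`Cell`, `Transaction`, `Conflict`, `atomically`) are hypothetical, and the sketch assumes a per-cell version counter as its only metadata.

```python
import threading

class Cell:
    """A shared word plus a version counter (hypothetical metadata)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

class Conflict(Exception):
    """Raised when a transaction's reads were invalidated by another commit."""

_commit_lock = threading.Lock()  # serializes commits only, not speculative work

class Transaction:
    def __init__(self):
        self.reads = {}   # cell -> version observed at first read
        self.writes = {}  # cell -> buffered (not yet visible) value

    def read(self, cell):
        if cell in self.writes:          # read-your-own-writes
            return self.writes[cell]
        # Record the version before reading the value, so a commit that
        # slips in between is caught at validation time.
        self.reads.setdefault(cell, cell.version)
        return cell.value

    def write(self, cell, value):
        self.writes[cell] = value        # buffered until commit

    def commit(self):
        with _commit_lock:
            # Validate: every cell we read must be unchanged.
            for cell, seen in self.reads.items():
                if cell.version != seen:
                    raise Conflict()
            # Publish: changes become visible only now.
            for cell, value in self.writes.items():
                cell.value = value
                cell.version += 1

def atomically(body, max_retries=1000):
    """Run body(tx) speculatively, retrying on conflict."""
    for _ in range(max_retries):
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue
    raise RuntimeError("transaction did not commit")

# Usage: an update spanning two words, which a single atomic cannot cover.
a, b = Cell(100), Cell(0)

def transfer(tx):
    tx.write(a, tx.read(a) - 10)
    tx.write(b, tx.read(b) + 10)

atomically(transfer)
print(a.value, b.value)  # prints "90 10"
```

The sketch buffers writes and validates reads only at commit, so a doomed transaction may briefly observe inconsistent values before it aborts. Production TM designs add further checks to rule this out, and a CPU–GPU implementation would rely on platform atomics rather than a Python lock.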

Notes

  1. http://www.hsafoundation.com/.

  2. https://github.com/RadeonOpenCompute/ROCm.

Author information

Corresponding author

Correspondence to Alejandro Villegas.

Additional information

This work has been supported by projects TIN2013-42253-P and TIN2016-80920-R, from the Spanish Government, and P11-TIC8144 and P12-TIC1470, from Junta de Andalucia.

About this article

Cite this article

Villegas, A., Navarro, A., Asenjo, R. et al. Toward a software transactional memory for heterogeneous CPU–GPU processors. J Supercomput 75, 4177–4192 (2019). https://doi.org/10.1007/s11227-018-2347-0

  • DOI: https://doi.org/10.1007/s11227-018-2347-0

Keywords

  • Transactional memory
  • APU processors
  • Parallel programming
  • Data sharing