Advertisement

Performance Evaluation of Thread-Level Speculation in Off-the-Shelf Hardware Transactional Memories

  • Juan SalamancaEmail author
  • José Nelson Amaral
  • Guido Araujo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)

Abstract

Thread-Level Speculation (TLS) is a hardware/software technique that enables the execution of multiple loop iterations in parallel, even in the presence of some loop-carried dependences. TLS requires hardware mechanisms to support conflict detection, speculative storage, in-order commit of transactions, and transaction roll-back. There is no off-the-shelf processor that provides direct support for TLS. Speculative execution is supported, however, in the form of Hardware Transactional Memory (HTM)—available in recent processors such as the Intel Core and the IBM POWER8. Earlier work has demonstrated that, in the absence of specific TLS support in commodity processors, HTM support can be used to implement TLS. This paper presents a careful evaluation of the implementation of TLS on the HTM extensions available in such machines. This evaluation provides evidence to support several important claims about the performance of TLS over HTM in the Intel Core and the IBM POWER8 architectures. Experimental results reveal that by implementing TLS on top of HTM, speed-ups of up to 3.8\(\times \) can be obtained for some loops.

Keywords

Thread-Level Speculation Transactional memory 

Notes

Acknowledgements

The authors would like to thank FAPESP (grants 15/04285-5, 15/12077-3, and 13/08293-7) and the NSERC for supporting this work.

References

  1. 1.
    cTuning Foundation: cBench: Collective benchmarks (2016). http://ctuning.org/cbench
  2. 2.
    Hurson, A.R., Lim, J.T., Kavi, K.M., Lee, B.: Parallelization of doall and doacross loops-a survey. Adv. Comput. 45, 53–103 (1997)CrossRefGoogle Scholar
  3. 3.
  4. 4.
    Intel Corporation: Intel architecture instruction set extensions programming reference. Intel transactional synchronization extensions, Chap. 8 (2012)Google Scholar
  5. 5.
    Le, H., Guthrie, G., Williams, D., Michael, M., Frey, B., Starke, W., May, C., Odaira, R., Nakaike, T.: Transactional memory support in the IBM POWER8 processor. IBM J. Res. Dev. 59(1), 8:1–8:14 (2015)CrossRefGoogle Scholar
  6. 6.
    Murphy, N., Jones, T., Mullins, R., Campanoni, S.: Performance implications of transient loop-carried data dependences in automatically parallelized loops. In: International Conference on Compiler Construction (CC), pp. 23–33, Barcelona, Spain (2016)Google Scholar
  7. 7.
    Nakaike, T., Odaira, R., Gaudet, M., Michael, M.M., Tomari, H.: Quantitative comparison of hardware transactional memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8. In: International Conference on Computer Architecture (ISCA), pp. 144–157, Portland, OR (2015)Google Scholar
  8. 8.
    Odaira, R., Nakaike, T.: Thread-level speculation on off-the-shelf hardware transactional memory. In: International Symposium on Workload Characterization (IISWC), pp. 212–221, Atlanta, Georgia, USA, October 2014Google Scholar
  9. 9.
    Salamanca, J., Amaral, J.N., Araujo, G.: Evaluating and improving thread-level speculation in hardware transactional memories. In: IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 586–595, Chicago, USA (2016)Google Scholar
  10. 10.
    Steffan, J., Mowry, T.: The potential for using thread-level data speculation to facilitate automatic parallelization. In: High Performance Computer Architecture (HPCA), p. 2, Washington, DC, USA (1998)Google Scholar
  11. 11.
    Steffan, J.G., Colohan, C.B., Zhai, A., Mowry, T.C.: A scalable approach to thread-level speculation. In: International Conference on Computer Architecture (ISCA), pp. 1–12, Vancouver, BC, Canada (2000)Google Scholar
  12. 12.
    Tournavitis, G., Wang, Z., Franke, B., O’Boyle, M.F.: Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping. In: Programming Language Design and Implementation (PLDI), pp. 177–187, PLDI 2009, ACM, Dublin, Ireland (2009)Google Scholar

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Juan Salamanca
    • 1
    Email author
  • José Nelson Amaral
    • 2
  • Guido Araujo
    • 1
  1. 1.Institute of ComputingUNICAMPCampinasBrazil
  2. 2.Computing Science DepartmentUniversity of AlbertaEdmontonCanada

Personalised recommendations