Using Transactional Memory to Avoid Blocking in OpenMP Synchronization Directives

Don’t Wait, Speculate!
  • Lars Bonnichsen
  • Artur Podobas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9342)


OpenMP applications with abundant parallelism often achieve high performance. Unfortunately, OpenMP applications with many synchronization or serialization points perform poorly because of blocking, i.e., threads have to wait for each other. In this paper, we present methods based on hardware transactional memory (HTM) for executing OpenMP barrier, critical, and taskwait directives without blocking. Although HTM is still relatively new in Intel and IBM architectures, we experimentally show a 73% performance improvement over traditional locking approaches and a 23% improvement over other HTM approaches on critical sections. Speculation over barriers can decrease execution time by up to 41%. We expect future systems with HTM support and more cores to benefit even more from our approach, as they are more likely to block.
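To make the idea of a non-blocking critical section concrete, the following is a minimal sketch of the generic lock-elision pattern on which HTM-based critical sections are commonly built. It is not the method presented in the paper, which the abstract does not describe in detail. It assumes Intel RTM intrinsics (`_xbegin`, `_xend`, `_xabort`) when the compiler defines `__RTM__`, and otherwise uses only the lock path. The names `critical_elided`, `fallback_lock`, `add_one`, and `MAX_ATTEMPTS` are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#ifdef __RTM__
#include <immintrin.h>              /* _xbegin, _xend, _xabort */
#endif

#define MAX_ATTEMPTS 3              /* hypothetical retry budget before locking */

static atomic_int fallback_lock = 0;    /* 0 = free, 1 = held */
static long shared_counter = 0;         /* data protected by the critical section */

/* Blocking path: a plain test-and-set spin lock. */
static void lock_acquire(void) {
    while (atomic_exchange_explicit(&fallback_lock, 1, memory_order_acquire) != 0) {
        /* spin until the previous value was 0 (lock was free) */
    }
}

static void lock_release(void) {
    assert(atomic_load_explicit(&fallback_lock, memory_order_relaxed) == 1);
    atomic_store_explicit(&fallback_lock, 0, memory_order_release);
}

/* Example critical-section body: increment the long that arg points to. */
static void add_one(void *arg) {
    ++*(long *)arg;
}

/* Run body(arg) as a critical section. With RTM available, first try to
 * execute it speculatively as a hardware transaction; if every attempt
 * aborts, fall back to taking the lock. */
static void critical_elided(void (*body)(void *), void *arg) {
#ifdef __RTM__
    for (int attempt = 0; attempt < MAX_ATTEMPTS; ++attempt) {
        if (_xbegin() == _XBEGIN_STARTED) {
            /* Reading the lock adds it to the transaction's read set, so a
             * thread that later takes the lock aborts this transaction. */
            if (atomic_load_explicit(&fallback_lock, memory_order_relaxed) != 0)
                _xabort(0xff);      /* lock is held: give up this attempt */
            body(arg);
            _xend();                /* commit: no lock was ever acquired */
            return;
        }
        /* Execution resumes here after an abort; retry or fall through. */
    }
#endif
    lock_acquire();
    body(arg);
    lock_release();
}
```

A caller would write `critical_elided(add_one, &shared_counter);` where it would otherwise take a lock around the increment. With GCC or Clang, the transactional path is compiled in only when RTM support is enabled (for example with `-mrtm`), and it should only be run on a processor that actually implements RTM; otherwise the function behaves like an ordinary spin-locked critical section.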



This article presents results of research and development work carried out in the European collaborative project PaPP (Portable and Predictable Performance on Heterogeneous Embedded Manycores), funded jointly by the ARTEMIS Joint Undertaking and national governments under Call 2011, Project Nr. 295440.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Technical University of Denmark, Lyngby, Denmark
  2. KTH Royal Institute of Technology, Stockholm, Sweden
