International Journal of Parallel Programming, Volume 40, Issue 1, pp 25–56

Profiling and Optimizing Transactional Memory Applications

  • Ferad Zyulkyarov
  • Srdjan Stipic
  • Tim Harris
  • Osman S. Unsal
  • Adrián Cristal
  • Ibrahim Hur
  • Mateo Valero
Article

Abstract

Many researchers have developed applications using transactional memory (TM) with the purpose of benchmarking different implementations and studying whether or not TM is easy to use. However, comparatively little has been done to provide general-purpose tools for profiling and optimizing programs that use transactions. In this paper we introduce a series of profiling and optimization techniques for TM applications. The profiling techniques are of three types: (i) techniques to identify multiple potential conflicts from a single program run, (ii) techniques to identify the data structures involved in conflicts by using a symbolic path through the heap, rather than a machine address, and (iii) visualization techniques to summarize how threads spend their time and which of their transactions conflict most frequently. Together they provide in-depth and comprehensive information about the wasted work caused by aborting transactions. To reduce the contention between transactions we suggest several TM-specific optimizations that leverage nested transactions, transaction checkpoints, and early release, among others. To examine the effectiveness of the profiling and optimization techniques, we provide a series of illustrations from the STAMP TM benchmark suite and from the synthetic WormBench workload. First we analyze the performance of TM applications using our profiling techniques, and then we apply various optimizations to improve the performance of the Bayes, Labyrinth and Intruder applications. We discuss the design and implementation of the profiling techniques in the Bartok-STM system. Where possible, we process data offline or during garbage collection in order to minimize the probe effect introduced by profiling.

Keywords

Profiling, Transaction memory, Application

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Ferad Zyulkyarov (1, 2)
  • Srdjan Stipic (1, 2)
  • Tim Harris (3)
  • Osman S. Unsal (1)
  • Adrián Cristal (1, 4)
  • Ibrahim Hur (1)
  • Mateo Valero (1, 2)

  1. BSC-Microsoft Research Centre, Barcelona, Spain
  2. Universitat Politècnica de Catalunya, Barcelona, Spain
  3. Microsoft Research, Barcelona, Spain
  4. IIIA-Artificial Intelligence Research Institute, CSIC-Spanish National Research Council, Barcelona, Spain
