Dynamic Thread Mapping Based on Machine Learning for Transactional Memory Applications

  • Márcio Castro
  • Luís Fabrício Wanderley Góes
  • Luiz Gustavo Fernandes
  • Jean-François Méhaut
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7484)


Thread mapping is an appealing approach to efficiently exploit the potential of modern chip-multiprocessors. However, efficient thread mapping relies on matching the behavior of an application to the characteristics of the system. In particular, Software Transactional Memory (STM) introduces another dimension due to its runtime system support. In this work, we propose a dynamic thread mapping approach that automatically infers a suitable thread mapping strategy for transactional memory applications composed of multiple execution phases, each with potentially different transactional behavior. At runtime, the approach profiles the application at specific periods and consults a decision tree generated by a Machine Learning algorithm to decide whether the current thread mapping strategy should be switched to a more adequate one. We implemented this approach in a state-of-the-art STM system, making it transparent to the user. Our results show that the proposed dynamic approach improves performance by up to 31% compared to the best static solution.
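The runtime loop described in the abstract (profile periodically, consult a decision tree, switch the mapping only when the prediction changes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two profiled metrics, the thresholds, the strategy names ("os_default", "compact", "scatter"), and the topology model in which core ids are numbered contiguously within each socket. The hand-written `predict_strategy` function merely stands in for a learned decision tree.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical metrics gathered during one profiling period."""
    abort_ratio: float    # aborted / started transactions
    tx_time_ratio: float  # fraction of time spent inside transactions


def predict_strategy(p: Profile) -> str:
    # Stand-in for a learned decision tree; thresholds are invented.
    if p.tx_time_ratio < 0.3:
        return "os_default"
    if p.abort_ratio > 0.5:
        return "compact"
    return "scatter"


def compute_mapping(strategy, n_threads, n_sockets, cores_per_socket):
    """Return one core id per thread; None leaves placement to the OS."""
    if strategy == "os_default":
        return [None] * n_threads
    total = n_sockets * cores_per_socket
    if strategy == "compact":
        # Fill neighbouring cores first.
        return [t % total for t in range(n_threads)]
    # scatter: distribute threads round-robin across sockets.
    mapping = []
    for t in range(n_threads):
        socket = t % n_sockets
        slot = (t // n_sockets) % cores_per_socket
        mapping.append(socket * cores_per_socket + slot)
    return mapping


class DynamicMapper:
    """Keeps the current strategy and switches only when the prediction changes."""

    def __init__(self, n_threads, n_sockets, cores_per_socket):
        self.shape = (n_threads, n_sockets, cores_per_socket)
        self.current = "os_default"
        self.switches = 0

    def on_profile(self, profile: Profile):
        wanted = predict_strategy(profile)
        if wanted != self.current:
            self.current = wanted
            self.switches += 1
        return compute_mapping(self.current, *self.shape)


mapper = DynamicMapper(n_threads=4, n_sockets=2, cores_per_socket=4)
for phase in (Profile(0.1, 0.1), Profile(0.8, 0.9), Profile(0.1, 0.9)):
    print(mapper.current, "->", mapper.on_profile(phase))
```

In a real STM runtime the returned core ids would be applied through an affinity mechanism of the operating system; that step is omitted here so the sketch stays portable.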


Keywords: transactional memory, dynamic thread mapping, machine learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Márcio Castro (1)
  • Luís Fabrício Wanderley Góes (2)
  • Luiz Gustavo Fernandes (3)
  • Jean-François Méhaut (1)

  1. INRIA - CEA - LIG Laboratory, Grenoble University, Montbonnot Saint Martin, France
  2. Department of Computer Science, Pontifical Catholic University of Minas Gerais, Belo Horizonte, Brazil
  3. PPGCC - Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
