
Efficient Dynamic Pinning of Parallelized Applications by Reinforcement Learning with Applications

  • Georgios C. Chasparis
  • Michael Rossbory
  • Vladimir Janjic
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)

Abstract

This paper describes a dynamic framework for mapping the threads of parallel applications to the computation cores of parallel systems. We propose a feedback-based mechanism in which the performance of each thread is measured and used to drive a reinforcement-learning policy that assigns thread affinities to CPU cores. The proposed framework is flexible enough to address different optimization criteria, such as maximum processing speed and minimum speed variance among threads. We evaluate the framework on the Ant Colony Optimization parallel benchmark from the heuristic optimization domain, and demonstrate that we can achieve an improvement of 12% in execution time compared to the default operating-system scheduling and mapping of threads under varying availability of resources (e.g., when multiple applications are running on the same system).
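To make the feedback loop concrete, the following is a minimal sketch of one possible reinforcement-learning update for a single thread. It is not taken from the paper: it assumes a linear reward-inaction learning automaton, a reward already normalized to [0, 1], and hypothetical names (`choose_core`, `reinforce`, `simulate`, `core_speeds`). The measured thread performance is replaced by a fixed per-core number so that the sketch runs anywhere.

```python
import random


def choose_core(strategy, rng):
    """Sample a core index from the thread's probability vector."""
    return rng.choices(range(len(strategy)), weights=strategy, k=1)[0]


def reinforce(strategy, core, reward, step=0.1):
    """Linear reward-inaction update (an assumed rule, not the paper's).

    Shifts probability mass toward the selected core in proportion to
    the reward. With step * reward in [0, 1], the result remains a
    valid probability vector.
    """
    return [
        p + step * reward * ((1.0 if i == core else 0.0) - p)
        for i, p in enumerate(strategy)
    ]


def simulate(core_speeds, rounds=2000, step=0.1, seed=0):
    """Run the select / measure / update loop for one thread.

    core_speeds is a hypothetical stand-in for measured performance:
    one value in [0, 1] per core, higher meaning faster.
    """
    rng = random.Random(seed)
    n = len(core_speeds)
    strategy = [1.0 / n] * n  # start with no preference among cores
    for _ in range(rounds):
        core = choose_core(strategy, rng)
        reward = core_speeds[core]  # in a real system: measured thread speed
        strategy = reinforce(strategy, core, reward, step)
    return strategy
```

In a real deployment on Linux, the selected core could be applied with a call such as `os.sched_setaffinity`, and the reward would come from a performance measurement rather than a fixed table. Both choices are assumptions of this sketch.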

Notes

Acknowledgments

This work has been partially supported by the European Union grant EU H2020-ICT-2014-1 project RePhrase (No. 644235).


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Georgios C. Chasparis (1)
  • Michael Rossbory (1)
  • Vladimir Janjic (2)
  1. Software Competence Center Hagenberg GmbH, Hagenberg, Austria
  2. School of Computer Science, University of St Andrews, Scotland, UK
