Skip to main content
Log in

Efficient Dynamic Pinning of Parallelized Applications by Distributed Reinforcement Learning

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript

Abstract

This paper introduces a resource allocation framework specifically tailored for addressing the problem of dynamic placement (or pinning) of parallelized applications to processing units. Under the proposed setup each thread of the parallelized application constitutes an independent decision maker (or agent), which (based on its own prior performance measurements and its own prior CPU-affinities) decides on which processing unit to run next. Decisions are updated recursively for each thread by a resource manager/scheduler which runs in parallel to the application’s threads and periodically records their performances and assigns to them new CPU affinities. For updating the CPU-affinities, the scheduler uses a distributed reinforcement-learning algorithm, each branch of which is responsible for assigning a new placement strategy to each thread. The proposed framework is flexible enough to address alternative optimization criteria, such as maximum average processing speed and minimum speed variance among threads. We demonstrate analytically that convergence to locally-optimal placements is achieved asymptotically. Finally, we validate these results through experiments in Linux platforms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Notes

  1. A strategy profile \(\sigma ^*=(\sigma _1^*,\ldots ,\sigma _n^*)\in {\varvec{\Delta }}\) is a Nash equilibrium if, no agent has incentive to change unilaterally its own strategy, i.e., no agent can increase its expected utility by altering its own strategy.

References

  1. Bini, E., Buttazzo, G.C., Eker, J., Schorr, S., Guerra, R., Fohler, G., Årzén, K.E., Vanessa, R., Scordino, C.: Resource management on multicore systems: The ACTORS approach. IEEE Micro 31(3), 72–81 (2011)

    Article  Google Scholar 

  2. Brecht, T.: On the importance of parallel application placement in NUMA multiprocessors. In: Proceedings of the Symposium on Experiences with Distributed and Multiprocessor Systems (SEDMS IV). pp. 1–18. San Deigo, CA (1993)

  3. Broquedis, F., Furmento, N., Goglin, B., Wacrenier, P.A., Namyst, R.: ForestGOMP: an efficient OpenMP environment for NUMA architectures. Int. J. Parallel Program. 38, 418–439 (2010)

    Article  MATH  Google Scholar 

  4. Chasparis, G.C., Maggio, M., Bini, E., Årzén, K.E.: Design and implementation of distributed resource management for time-sensitive applications. Automatica 64, 44–53 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  5. Chasparis, G.C., Shamma, J.S., Rantzer, A.: Nonconvergence to saddle boundary points under perturbed reinforcement learning. Int. J. Game Theory 44(3), 667–699 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  6. Chasparis, G., Shamma, J.: Distributed dynamic reinforcement of efficient outcomes in multiagent coordination and network formation. Dyn. Games Appl. 2(1), 18–50 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  7. De Angelis, F., Boaro, M., Fuselli, D., Squartini, S., Piazza, F., Wei, Q.: Optimal home energy management under dynamic electrical and thermal constraints. IEEE Trans. Ind. Inform. 9(3), 1518–1527 (2013)

  8. Inaltekin, H., Wicker, S.: A one-shot random access game for wireless networks. In: International Conference on Wireless Networks, Communications and Mobile Computing (2005)

  9. Klug, T., Ott, M., Weidendorfer, J., Trinitis, C.: autopin: automated optimization of thread-to-core pinning on multicore systems. In: Stenstrom, P. (ed.) Transactions on High-Performance Embedded Architectures and Compilers III. Lecture Notes in Computer Science, vol. 6590, pp. 219–235. Springer, Berlin (2011)

  10. Kushner, H.J., Yin, G.G.: Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Springer, New York (2003)

    MATH  Google Scholar 

  11. Mucci, P.J., Browne, S., Deane, C., Ho, G.: PAPI: a portable interface to hardware performance counters. In: Proceedings of the Department of Defense HPCMP Users Group Conference. pp. 7–10 (1999)

  12. Narendra, K., Thathachar, M.: Learning Automata: An Introduction. Prentice-Hall, Upper Saddle River (1989)

    Google Scholar 

  13. Olivier, S., Porterfield, A., Wheeler, K.: Scheduling task parallelism on multi-socket multicore systems. In: ROSS’11. pp. 49–56. Tuscon, Arizona, USA (2011)

  14. Subrata, R., Zomaya, A.Y., Landfeldt, B.: A cooperative game framework for QoS guided job allocation schemes in grids. IEEE Trans. Comput. 57(10), 1413–1422 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  15. Tembine, H., Altman, E., ElAzouri, R., Hayel, Y.: Correlated evolutionary stable strategies in random medium access control. In: International Conference on Game Theory for Networks. pp. 212–221 (2009)

  16. Thibault, S., Namyst, R., Wacrenier, P.: Building portable thread schedulers for hierarchical multi-processors: the bubblesched framework. In: Euro-Par. ACM. Rennes, France (2007)

  17. Wei, G., Vasilakos, A.V., Zheng, Y., Xiong, N.: A game-theoretic method of fair resource allocation for cloud computing services. J. Supercomput. 54(2), 252–269 (2010)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Georgios C. Chasparis.

Additional information

This work has been supported by the European Union Grant EU H2020-ICT-2014-1 project RePhrase (No. 644235). It has also been partially supported by the Austrian Ministry for Transport, Innovation and Technology, the Federal Ministry of Science, Research and Economy, and the Province of Upper Austria in the frame of the COMET center SCCH.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chasparis, G.C., Rossbory, M. Efficient Dynamic Pinning of Parallelized Applications by Distributed Reinforcement Learning. Int J Parallel Prog 47, 24–38 (2019). https://doi.org/10.1007/s10766-017-0541-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10766-017-0541-y

Keywords

Navigation