Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing

Odajima, Tetsuya; Boku, Taisuke; Sato, Mitsuhisa; Hanawa, Toshihiro; Kodama, Yuetsu; Namyst, Raymond; Thibault, Samuel; Aumage, Olivier

doi:10.1007/978-3-319-03889-6_7

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8286)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

When work is shared among GPUs and CPU cores on GPU-equipped clusters, keeping the load balanced across these heterogeneous computing resources is a critical issue. We have been developing XcalableMP-dev/StarPU [1], a run-time system based on a PGAS language, to address this problem. Through this development, we found that adaptive load balancing is necessary for GPU/CPU work sharing to achieve the best performance across various application codes.

In this paper, we enhance our language system XcalableMP-dev/StarPU with a new feature that dynamically controls the size of the tasks assigned to these heterogeneous resources during application execution. Performance evaluation on several benchmarks confirmed that the proposed feature works correctly and that heterogeneous work sharing achieves up to about 40% higher performance than GPU-only execution, even for relatively small problem sizes.
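The abstract does not say how the task size is adjusted, so the following is only a minimal sketch of one common way to rebalance a GPU/CPU split at run time, not the method used in XcalableMP-dev/StarPU. The names `rebalance` and `simulate`, the `damping` parameter, and the clamping bounds are all assumptions made for illustration. The idea is to measure how long each side took on its current share, estimate each side's throughput, and move the GPU share toward the ratio that would make both sides finish at the same time.

```python
def rebalance(gpu_share, gpu_time, cpu_time, damping=0.5):
    """Return an updated fraction of the work to assign to the GPU.

    gpu_share: current fraction of the work given to the GPU, in (0, 1).
    gpu_time, cpu_time: measured seconds each side needed for its share.
    damping: hypothetical smoothing factor; 1.0 jumps straight to the target.
    """
    # Throughput estimate: fraction of the work completed per second.
    gpu_rate = gpu_share / gpu_time
    cpu_rate = (1.0 - gpu_share) / cpu_time
    # Both sides finish together when shares are proportional to throughput.
    target = gpu_rate / (gpu_rate + cpu_rate)
    new_share = gpu_share + damping * (target - gpu_share)
    # Keep a small share on each side so both can still be measured.
    return min(max(new_share, 0.05), 0.95)


def simulate(total_work, gpu_speed, cpu_speed, steps=20, gpu_share=0.5):
    """Toy model: each device processes work at a fixed speed (units/second)."""
    for _ in range(steps):
        gpu_time = gpu_share * total_work / gpu_speed
        cpu_time = (1.0 - gpu_share) * total_work / cpu_speed
        gpu_share = rebalance(gpu_share, gpu_time, cpu_time)
    return gpu_share


if __name__ == "__main__":
    # A GPU four times faster than the CPU cores should end up with
    # roughly four fifths of the work in this toy model.
    print(round(simulate(1000.0, gpu_speed=8.0, cpu_speed=2.0), 3))
```

In this toy model the device speeds are constant, so the share converges quickly; on real hardware the measured times vary with problem size and data transfer cost, which is why a damping factor below 1.0 is used here to avoid oscillation.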

References

[1] Odajima, T., Boku, T., Hanawa, T., Lee, J., Sato, M.: GPU/CPU Work Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing. In: Sixth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), pp. 97–106 (September 2012)
[2] XcalableMP, http://www.xcalablemp.org/
[3] Lee, J., MinhTuan, T., Odajima, T., Boku, T., Sato, M.: An Extension of XcalableMP PGAS Lanaguage for Multi-node GPU Clusters. In: HeteroPar 2011 (with EuroPar 2011), pp. 429–439 (2011)
[4] StarPU, http://runtime.bordeaux.inria.fr/StarPU/
[5] Lee, J., Sato, M.: Implementation and Performance Evaluation of XcalableMP: A Parallel Programming Language for Distributed Memory Systems. In: Third International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), pp. 413–420 (September 2010)
[6] High Performance Fortran Version 2.0, http://www.hpfpc.org/jahpf/spec/hpf-v20-j10.pdf
[7] Texas Advanced Computing Center - GotoBlas2, http://www.tacc.utexas.edu/tacc-projects/gotoblas2
[8] PGI Accelerator Compiler, http://www.softek.co.jp/SPG/Pgi/Accel/index.html
[9] HMPP Workbench, http://www.caps-entreprise.com/hmpp.html
[10] Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S.: Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs. In: GPU Computing Gems, vol. 2 (September 2010)
[11] Augonnet, C., Thibault, S., Namyst, R.: StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines. Concurrency Computat.: Pract. Exper. (March 2010)

Author information

Authors and Affiliations

Graduate School of Systems and Information Engineering, University of Tsukuba, Japan
Tetsuya Odajima, Taisuke Boku, Mitsuhisa Sato & Yuetsu Kodama
Center for Computational Sciences, University of Tsukuba, Japan
Taisuke Boku, Mitsuhisa Sato, Toshihiro Hanawa & Yuetsu Kodama
University of Bordeaux - LaBRI - INRIA Bordeaux Sud-Ouest, France
Raymond Namyst, Samuel Thibault & Olivier Aumage

Editor information

Editors and Affiliations

Dipartimento di Ingegneria Industriale e dell’Informazione, Seconda Universita’ di Napoli, 81031, Aversa, Italy
Rocco Aversa
Institute of Computer Science, Cracow University of Technology, Warszawska 24, 31-155, Cracow, Poland
Joanna Kołodziej
School of Information Technology, Deakin University, Waurn Ponds, VA, Australia
Jun Zhang
Ingegneria Elettrica e Tecnologie dell’Informazione, Università degli Studi di Napoli Federico II, 80125, Naples, Italy
Flora Amato
DIMES, Università della Calabria, via P. Bucci 41c, 87036, Rende, CS, Italy
Giancarlo Fortino

Cite this paper

Odajima, T. et al. (2013). Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing. In: Aversa, R., Kołodziej, J., Zhang, J., Amato, F., Fortino, G. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8286. Springer, Cham. https://doi.org/10.1007/978-3-319-03889-6_7

DOI: https://doi.org/10.1007/978-3-319-03889-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03888-9
Online ISBN: 978-3-319-03889-6
eBook Packages: Computer Science, Computer Science (R0)
