Batch Method for Efficient Resource Sharing in Real-Time Multi-GPU Systems

Verner, Uri; Mendelson, Avi; Schuster, Assaf

doi:10.1007/978-3-642-45249-9_23

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8314)

Included in the following conference series:

International Conference on Distributed Computing and Networking

Abstract

The performance of many GPU-based systems depends heavily on the effective bandwidth for transferring data between the processors. For real-time systems, the importance of data transfer rates may be even higher due to non-deterministic transfer times that limit the ability to satisfy response time requirements. We present a new method that allows real-time applications to make efficient use of the communication infrastructure in multi-GPU systems, while retaining the necessary execution time predictability. Our method is based on a new application interface for executing batch operations composed of multiple command streams that can be executed in parallel. The new interface provides the run-time with information it needs to optimize the communication and to reduce the execution time. The method is compliant with common scheduling algorithms, such as EDF and RM, as it provides accurate offline execution time prediction for jobs using their definition and system characteristics.

Experiments with two multi-GPU systems show that our method achieves 7.9x shorter execution time than the bandwidth allocation method, and 39% higher image resolution than the time division method, for realistic applications.
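
To illustrate why an accurate offline execution time prediction matters for schedulers such as EDF and RM, the following minimal sketch feeds predicted per-job execution times into the classic utilization-based schedulability tests. It is not the analysis from the paper: it assumes the textbook setting of preemptive scheduling on a single shared resource with deadlines equal to periods, and the `Task` type, the function names, and the task values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical periodic task; one job could be one batch operation."""
    wcet: float    # predicted worst-case execution time per job, in ms (assumed known offline)
    period: float  # release period in ms; the deadline is assumed equal to the period


def utilization(tasks):
    """Total fraction of the resource demanded by the task set."""
    return sum(t.wcet / t.period for t in tasks)


def edf_schedulable(tasks):
    """Utilization test for preemptive EDF with implicit deadlines (U <= 1)."""
    return utilization(tasks) <= 1.0


def rm_bound(n):
    """Liu and Layland utilization bound for n tasks under rate-monotonic priorities."""
    return n * (2 ** (1.0 / n) - 1)


def rm_schedulable(tasks):
    """Sufficient (not necessary) RM test: passing guarantees schedulability,
    failing is inconclusive."""
    if not tasks:
        return True
    return utilization(tasks) <= rm_bound(len(tasks))


# Hypothetical task set: execution times as an offline predictor might report them.
tasks = [Task(wcet=2.0, period=10.0),
         Task(wcet=6.0, period=20.0),
         Task(wcet=12.0, period=40.0)]

if __name__ == "__main__":
    print("utilization:", utilization(tasks))
    print("EDF test passes:", edf_schedulable(tasks))
    print("RM sufficient test passes:", rm_schedulable(tasks))
```

For this hypothetical task set the total utilization is 0.8, which passes the EDF test but exceeds the three-task RM bound of roughly 0.78, so the RM test is inconclusive. Both tests are only as trustworthy as the execution times fed into them, which is why predictable transfer times matter in this setting.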

Author information

Authors and Affiliations

Technion – Israel Institute of Technology, Israel
Uri Verner, Avi Mendelson & Assaf Schuster

Editor information

Editors and Affiliations

Dept. of Electrical Engineering and Computer Science, University of Central Florida, 4000 Central Florida Blvd., P.O. Box 162362, 32816-2362, Orlando, FL, USA
Mainak Chatterjee
Dept. of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Jian-nong Cao
International Institute of Information Technology, 500 032, Hyderabad, India
Kishore Kothapalli
Instituto de Matemáticas, Universidad Nacional Autonoma de Mexico (UNAM), Ciudad Universitaria, D.F., 04510, Mexico
Sergio Rajsbaum

About this paper

Cite this paper

Verner, U., Mendelson, A., Schuster, A. (2014). Batch Method for Efficient Resource Sharing in Real-Time Multi-GPU Systems. In: Chatterjee, M., Cao, Jn., Kothapalli, K., Rajsbaum, S. (eds) Distributed Computing and Networking. ICDCN 2014. Lecture Notes in Computer Science, vol 8314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45249-9_23

DOI: https://doi.org/10.1007/978-3-642-45249-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45248-2
Online ISBN: 978-3-642-45249-9
eBook Packages: Computer Science, Computer Science (R0)
