Toward a transparent and efficient GPU cloudification architecture


The cloud model gives access to a vast amount of computational resources on a pay-per-use basis, alleviating acquisition and maintenance costs. However, some resources, such as graphics processing units (GPUs), have not been fully adapted to this model. Many areas would benefit from suitable GPU-based cloud solutions: video encoding, sequencing in bioinformatics, scene rendering in remote gaming, and machine learning. Cloud providers currently offer only local and exclusive access to GPUs by using PCI passthrough. This limitation can be overcome either by integrating new virtual GPUs (vGPUs) into cloud infrastructures or by providing mechanisms to cloudify existing GPUs that do not support native virtualization, yielding cloudified GPUs (cGPUs). The proposed architecture enables an effective and transparent integration of cGPUs in public cloud infrastructures. Our solution offers several access modes (local/remote and exclusive/shared) and autonomously configures its components by integrating with the message middleware of the cloud infrastructure. A prototype of the proposed architecture has been evaluated in a real cloud deployment. Experiments assess the overhead on the infrastructure and the performance of GPU-based applications using three different programs: matrix multiplication, sequencing read alignment, and Monte Carlo on multiple GPUs. Results show that our solution has a low impact on both the infrastructure and the performance of applications.
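The four access modes named in the abstract (local/remote combined with exclusive/shared) can be pictured with a small sketch. This is only an illustration of the idea, not the authors' implementation: the names `Location`, `Sharing`, `Gpu`, and `allocate` are hypothetical, and the first-fit policy is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Location(Enum):
    LOCAL = "local"    # GPU sits in the same host as the virtual machine
    REMOTE = "remote"  # GPU sits in another host and is reached over the network


class Sharing(Enum):
    EXCLUSIVE = "exclusive"  # a single virtual machine owns the GPU
    SHARED = "shared"        # several virtual machines use the same GPU


@dataclass
class Gpu:
    # Hypothetical record for one cloudified GPU (cGPU).
    gpu_id: str
    host: str
    exclusive_owner: Optional[str] = None
    shared_users: Set[str] = field(default_factory=set)

    def is_free(self) -> bool:
        return self.exclusive_owner is None and not self.shared_users


def allocate(gpus: List[Gpu], vm: str, vm_host: str,
             location: Location, sharing: Sharing) -> Optional[Gpu]:
    """First-fit allocation (an assumed policy): return a matching GPU or None."""
    for gpu in gpus:
        is_local = gpu.host == vm_host
        if (location is Location.LOCAL) != is_local:
            continue  # GPU is on the wrong side of the local/remote split
        if sharing is Sharing.EXCLUSIVE:
            if gpu.is_free():
                gpu.exclusive_owner = vm
                return gpu
        elif gpu.exclusive_owner is None:
            gpu.shared_users.add(vm)
            return gpu
    return None
```

With two GPUs on different hosts, a virtual machine on the first host could call `allocate(gpus, "vm1", "hostA", Location.LOCAL, Sharing.EXCLUSIVE)` to claim the local device, while later shared requests would be served by the remote one.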






Author information

Correspondence to Raúl Peña-Ortiz.

Cite this article

Gutiérrez-Aguado, J., Claver, J.M. & Peña-Ortiz, R. Toward a transparent and efficient GPU cloudification architecture. J Supercomput 75, 3640–3672 (2019).

Keywords


  • Cloud computing
  • Distributed computing
  • Platform virtualization
  • Computer network management
  • GPU cloudification