Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations

  • Everton Hermann
  • Bruno Raffin
  • François Faure
  • Thierry Gautier
  • Jérémie Allard
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6272)


Today, it is possible to combine multiple CPUs and multiple GPUs in a single shared-memory architecture. Using these resources efficiently and seamlessly is challenging. In this paper, we propose a parallelization scheme that dynamically balances the workload between multiple CPUs and GPUs. Most tasks have both a CPU and a GPU implementation, so they can be executed on any processing unit. We rely on two-level scheduling that combines traditional task graph partitioning with work stealing guided by processor affinity and heterogeneity. These criteria are intended to limit inefficient task migrations between GPUs, since memory transfers are costly, and to favor mapping small tasks to CPUs and large ones to GPUs to take advantage of heterogeneity. This scheme has been implemented to support the SOFA physics simulation engine. Experiments show that we can reach speedups of 22 with 4 GPUs and 29 with 4 CPU cores and 4 GPUs. CPUs offload small tasks from the GPUs, making the GPUs more efficient and leading to a “cooperative speedup” greater than the sum of the speedups obtained separately on 4 GPUs and 4 CPUs.
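The stealing criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the SOFA or KAAPI implementation: the `Task` and `Unit` types, the `cost` estimate, the `affinity` field, and the `SMALL_TASK_THRESHOLD` value are all assumptions introduced here for illustration. An idle unit scans the queues of the other units and prefers a task that matches its kind (small for a CPU, large for a GPU), then one whose data it already holds.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical cutoff between "small" and "large" tasks (not from the paper).
SMALL_TASK_THRESHOLD = 10.0


@dataclass
class Task:
    name: str
    cost: float    # assumed work estimate for the task
    affinity: int  # assumed id of the unit that already holds the task's data


@dataclass
class Unit:
    uid: int
    kind: str  # "cpu" or "gpu"
    queue: deque = field(default_factory=deque)


def steal(thief, units):
    """Pick and remove one task for the idle `thief`; return None if none exists.

    Candidates are ranked by (fits heterogeneity, data already local), so a
    task suited to the thief's kind always beats an unsuited one, and affinity
    breaks ties between equally suited tasks.
    """
    best, best_score, best_victim = None, None, None
    for victim in units:
        if victim is thief:
            continue
        for task in victim.queue:
            small = task.cost < SMALL_TASK_THRESHOLD
            # Heterogeneity: CPUs favor small tasks, GPUs favor large ones.
            fits = small if thief.kind == "cpu" else not small
            # Affinity: favor tasks whose data is already on the thief.
            local = task.affinity == thief.uid
            score = (fits, local)
            if best_score is None or score > best_score:
                best, best_score, best_victim = task, score, victim
    if best is None:
        return None
    best_victim.queue.remove(best)
    return best
```

In this sketch an idle CPU facing a GPU queue that holds one small and one large task takes the small one, which leaves the large task for a GPU. A real scheduler would also weigh the actual transfer cost before moving a task between GPUs; the sketch only ranks candidates.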


Collision Detection; Task Graph; Time Integration Step; Memory Transfer; Small Task
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Everton Hermann (1)
  • Bruno Raffin (1)
  • François Faure (2)
  • Thierry Gautier (1)
  • Jérémie Allard (1)
  1. INRIA, France
  2. Grenoble University, France
