Matrix Multiplication in Multiphysics Systems Using CUDA

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 240)


Multiphysics systems simulate physical phenomena described by Partial Differential Equations (PDEs). The most popular method of solving PDEs is the Finite Element Method. These simulations require a large amount of computational power, mostly because of extensive matrix processing. These high computational requirements have recently led to the parallelization of algorithms and to the use of Graphics Processing Units (GPUs). To take advantage of GPUs, one of the GPU programming models has to be used. In this paper, the CUDA model developed by NVIDIA is used to implement two parallel matrix multiplication algorithms. To evaluate the effectiveness of these algorithms, several experiments were performed, and the results were compared with those obtained by the classic Central Processing Unit (CPU) matrix multiplication algorithm. The comparison shows that matrix multiplication on the GPU significantly outperforms the classic CPU approach.
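The abstract does not say which two parallel algorithms the paper implements, so the following is only an illustrative sketch and not the authors' code. It contrasts the classic sequential triple-loop product with a plain-Python emulation of the usual CUDA-style decomposition, in which a grid of blocks of threads is launched and each thread computes one element of the result. The names `BLOCK`, `matmul_cpu`, and `matmul_kernel_emulated` are hypothetical, and the emulation runs entirely on the CPU.

```python
# Sketch only: emulates a CUDA-style launch on the CPU in plain Python.
# All names here (BLOCK, matmul_cpu, matmul_kernel_emulated) are hypothetical
# and are not taken from the paper.

BLOCK = 2  # assumed threads per block edge; real kernels often use 16 or 32


def matmul_cpu(a, b):
    """Classic sequential triple-loop product of two list-of-lists matrices."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_kernel_emulated(a, b):
    """Same product, organised as one emulated 'thread' per output element."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Ceil-divide so that partially filled edge blocks still cover the matrix.
    grid_y = (n + BLOCK - 1) // BLOCK
    grid_x = (m + BLOCK - 1) // BLOCK
    for by in range(grid_y):              # block index, y
        for bx in range(grid_x):          # block index, x
            for ty in range(BLOCK):       # thread index inside the block, y
                for tx in range(BLOCK):   # thread index inside the block, x
                    row = by * BLOCK + ty
                    col = bx * BLOCK + tx
                    # Bounds guard: threads outside the matrix do nothing.
                    if row < n and col < m:
                        c[row][col] = sum(a[row][p] * b[p][col] for p in range(k))
    return c


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
assert matmul_cpu(a, b) == matmul_kernel_emulated(a, b)
```

On a GPU the four nested block and thread loops would not exist as loops: every (block, thread) pair runs concurrently, which is where the speed-up over the sequential version would come from. The sketch makes no claim about the speed-ups measured in the paper.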


CUDA Matrix Multiplication Multiphysics Simulations libMesh 





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Electrical Engineering, Idaho State University, Pocatello, USA
  2. Department of Electrical and Computer Engineering, University of Nevada, Las Vegas, USA
