Hybrid Parallelization of Particle in Cell Monte Carlo Collision (PIC-MCC) Algorithm for Simulation of Low Temperature Plasmas
We illustrate the parallelization of a PIC code for kinetic simulation of low-temperature plasmas on Intel multicore (Xeon) and manycore (Xeon Phi) architectures, and subsequently on an HPC cluster. The implementation of the 2D-3v PIC-MCC algorithm described in the paper involves the numerical solution of the Vlasov-Poisson equations, which provides the spatial and temporal evolution of charged-particle velocity distribution functions in plasmas under the effect of self-consistent electromagnetic fields and collisions. Stringent numerical constraints on the total number of particles, the number of grid points, and the simulation time scale make serial PIC codes computationally prohibitive on CPUs for large problem sizes. We first describe a shared-memory parallelization technique using OpenMP and then propose a hybrid parallel scheme (OpenMP+MPI) targeting distributed-memory systems. The OpenMP-based PIC code has been executed on a Xeon processor and on Xeon Phi co-processors (Knights Corner and Knights Landing), and we compare the results against a serial implementation on an Intel Core i5 processor. Finally, we compare the results of the hybrid parallel code with those of the OpenMP-based parallel code. The hybrid strategy based on OpenMP and MPI, involving three-level parallelization (instruction-level, thread-level over many cores, and node-level across a cluster of Xeon processors), achieves a linear speedup on an HPC cluster with 4 nodes (64 cores in total). The results show that our particle-decomposition-based hybrid parallelization technique using private grids scales efficiently with increasing problem size and number of cores in the cluster.
This work was carried out using the HPC facilities at DA-IICT and hardware received under the BRNS-PFRC project. We would also like to acknowledge the help received from Colfax's remote access program. We thank Siddarth Kamaria, Harshil Shah and Riddhesh Markandeya for their contributions to the serial code development. Miral Shah thanks the Department of Atomic Energy, Govt. of India, for the junior research fellowship (JRF) received under the BRNS-PFRC project (No. 39/27/2015-BRNS/34081).