Optimize Memory Usage in Vector Particle-In-Cell (VPIC) to Break the 10 Trillion Particle Barrier in Plasma Simulations

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12743)

Abstract

Vector Particle-In-Cell (VPIC) is one of the fastest plasma simulation codes in the world, with particle counts ranging from one trillion on the first petascale system, Roadrunner, to ten trillion on the more recent Blue Waters supercomputer. As supercomputers continue to grow rapidly in size, so too does the gap between compute capability and memory capacity. Current memory systems significantly limit VPIC simulations, as the maximum number of particles that can be simulated depends directly on the available memory. In this study, we present a suite of VPIC memory optimizations (i.e., particle weight, half-precision, and fixed-point optimizations) that enable a significant increase in the number of particles in VPIC simulations. We assess the optimizations' impact on a GPU-accelerated Power9 system. Our optimizations enable a 31.25% reduction in memory usage and up to a 40% increase in the number of particles.
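The three optimizations named in the abstract can be illustrated with a back-of-the-envelope sketch. The layout below is hypothetical (the paper, not this snippet, defines VPIC's actual particle structure): a baseline particle holding cell-relative position offsets, a cell index, momentum, and a weight as 32-bit values occupies 32 bytes. Hoisting identical weights out to per-species storage and encoding the bounded offsets as 16-bit fixed-point integers is one arithmetic combination that yields exactly the 31.25% reduction reported.

```python
import numpy as np

# Hypothetical baseline layout (for illustration only): dx, dy, dz, cell
# index, ux, uy, uz, w -- eight 32-bit fields, 32 bytes per particle.
baseline_bytes = 8 * 4

def to_fixed(x, bits=16):
    """Encode values in [-1, 1) as signed fixed-point integers."""
    scale = 2 ** (bits - 1)
    return np.clip(np.round(x * scale), -scale, scale - 1).astype(np.int16)

def from_fixed(q, bits=16):
    """Decode signed fixed-point integers back to float32."""
    return q.astype(np.float32) / 2 ** (bits - 1)

# Quantization round-trip error for cell-relative offsets stays below one
# fixed-point step (2**-15 for 16-bit encoding).
rng = np.random.default_rng(0)
dx = rng.uniform(-1, 1, 1000).astype(np.float32)
assert np.max(np.abs(from_fixed(to_fixed(dx)) - dx)) < 2 ** -15

# Optimized layout: three int16 fixed-point offsets, an int32 cell index,
# three float32 momentum components; the weight is shared per species.
optimized_bytes = 3 * 2 + 4 + 3 * 4
savings = 1 - optimized_bytes / baseline_bytes
print(f"memory saved per particle: {savings:.2%}")
```

Cell-relative offsets are a natural fit for fixed-point storage because they are bounded in [-1, 1) by construction, so the encoding loses only uniform quantization precision rather than dynamic range; the half-precision optimization mentioned in the abstract trades accuracy for space along the same lines.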

Keywords

  • Particle-In-Cell method
  • Mixed-precision
  • Fixed-point
  • Plasma physics



Acknowledgments

Work performed under the auspices of the U.S. DOE by Triad National Security, LLC, and Los Alamos National Laboratory (LANL). This work was supported by the LANL ASC and Experimental Sciences programs. The UTK authors acknowledge the support of LANL under contract #578735 and IBM through a Shared University Research Award. LA-UR-21-21297.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Tan, N., Bird, R., Chen, G., Taufer, M. (2021). Optimize Memory Usage in Vector Particle-In-Cell (VPIC) to Break the 10 Trillion Particle Barrier in Plasma Simulations. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2021. ICCS 2021. Lecture Notes in Computer Science(), vol 12743. Springer, Cham. https://doi.org/10.1007/978-3-030-77964-1_35

  • DOI: https://doi.org/10.1007/978-3-030-77964-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77963-4

  • Online ISBN: 978-3-030-77964-1