LICOM3-CUDA: a GPU version of LASG/IAP climate system ocean model version 3 based on CUDA

The Journal of Supercomputing

Abstract

Ocean general circulation models (OGCMs) are essential tools for research in oceanography and atmospheric science. The LASG/IAP climate system ocean model version 3 (LICOM3) is a parallel OGCM. Our goal is to implement and optimize a GPU version of LICOM3 based on the compute unified device architecture (CUDA), called LICOM3-CUDA. Considering the characteristics of LICOM3 and CUDA, we design and implement several pivotal optimization methods, including redesigning the numerical algorithms of complicated functions, decoupling data dependencies, avoiding memory write conflicts, and optimizing communication. We select two experiments, at 1\(^{\circ }\) (small-scale) and 0.1\(^{\circ }\) (large-scale) resolution, to evaluate the performance of LICOM3-CUDA. On a platform with two Intel Xeon Gold 6148 CPUs and four NVIDIA Quadro GV100 GPUs, LICOM3-CUDA (1\(^{\circ }\)) achieves a simulation speed of 114.3 simulated years per day (SYPD). Compared with LICOM3, LICOM3-CUDA runs 6.5 times faster, and the compute-intensive module achieves a speedup of over 70\(\times\). In addition, the energy consumption per simulation year is reduced by 41.3%. For the high-resolution, large-scale simulation, as the number of GPUs increases from 96 to 1536, the time consumption of LICOM3-CUDA (0.1\(^{\circ }\)) decreases from 3261 to 720 seconds, a speedup of approximately 4.5\(\times\).
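The scaling figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the strong-scaling efficiency relative to the 96-GPU run is a derived quantity that the abstract does not report. It also assumes the 3261 s and 720 s timings refer to the same amount of simulated work.

```python
# Minimal sketch: reproduce the arithmetic behind the abstract's numbers.
# All function names are hypothetical helpers, not part of LICOM3-CUDA.

SECONDS_PER_DAY = 86400.0


def speedup(t_base: float, t_new: float) -> float:
    """Ratio of baseline wall-clock time to the new wall-clock time."""
    return t_base / t_new


def parallel_efficiency(t_base: float, t_new: float,
                        n_base: int, n_new: int) -> float:
    """Strong-scaling efficiency relative to the baseline device count
    (assumes both timings cover the same simulated workload)."""
    return speedup(t_base, t_new) / (n_new / n_base)


def seconds_per_simulated_year(sypd: float) -> float:
    """Wall-clock seconds needed for one simulated year at a given SYPD."""
    return SECONDS_PER_DAY / sypd


if __name__ == "__main__":
    # 0.1-degree case: 96 -> 1536 GPUs, 3261 s -> 720 s (from the abstract).
    s = speedup(3261.0, 720.0)
    e = parallel_efficiency(3261.0, 720.0, 96, 1536)
    print(f"speedup: {s:.2f}x on {1536 // 96}x the GPUs, efficiency {e:.1%}")

    # 1-degree case: 114.3 SYPD (from the abstract).
    t = seconds_per_simulated_year(114.3)
    print(f"wall-clock per simulated year: {t:.1f} s")
```

Under these assumptions the 16-fold increase in GPU count yields the roughly 4.5\(\times\) speedup stated above, which corresponds to a strong-scaling efficiency of a little under 30% relative to the 96-GPU run.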

Data and code availability

The model code (LICOM3-CUDA v1.0), along with the paper data, dataset, and a 100 km (1\(^{\circ }\)) case, can be downloaded from https://zenodo.org/record/7440403 (last access: 15 December 2022) [42].

References

  1. Lazo JK, Lawson M, Larsen PH, Waldman DM (2011) U.S. economic sensitivity to weather variability. Bull Am Meteorol Soc 92(6):709–720. https://doi.org/10.1175/2011BAMS2928.1

  2. Schär C, Fuhrer O, Arteaga A, Ban N, Charpilloz C, Girolamo SD, Hentgen L, Hoefler T, Lapillonne X, Leutwyler D, Osterried K, Panosetti D, Rüdisühli S, Schlemmer L, Schulthess TC, Sprenger M, Ubbiali S, Wernli H (2020) Kilometer-scale climate models: prospects and challenges. Bull Am Meteorol Soc 101(5):567–587. https://doi.org/10.1175/BAMS-D-18-0167.1

  3. Khan HN, Hounshell DA, Fuchs ER (2018) Science and research policy at the end of Moore’s law (vol 1, pg 14, 2018). Nat Electron 1(2):146–146. https://doi.org/10.1038/s41928-017-0005-9

  4. Frank DJ, Dennard RH, Nowak E, Solomon PM, Taur Y, Wong H-SP (2001) Device scaling limits of Si MOSFETs and their application dependencies. Proc IEEE 89(3):259–288. https://doi.org/10.1109/5.915374

  5. Bauer P, Dueben PD, Hoefler T, Quintino T, Schulthess TC, Wedi NP (2021) The digital revolution of earth-system science. Nat Comput Sci 1(2):104–113. https://doi.org/10.1038/s43588-021-00023-0

  6. Michalakes J, Vachharajani M (2008) GPU acceleration of numerical weather prediction. Parallel Process Lett 18(04):531–548. https://doi.org/10.1142/S0129626408003557

  7. Wang Y, Jiang J, Zhang H, Dong X, Wang L, Ranjan R, Zomaya AY (2017) A scalable parallel algorithm for atmospheric general circulation models on a multi-core cluster. Futur Gener Comput Syst 72:1–10. https://doi.org/10.1016/j.future.2017.02.008

  8. TOP 500 NOVEMBER 2021. https://www.top500.org/lists/top500/2021/11/ Accessed 23 January 2022

  9. Zhao W-L, Wang W, Wang Q (2022) Optimization of cosmological N-body simulation with FMM-PM on SIMT accelerators. J Supercomput 78(5):7186–7205. https://doi.org/10.1007/s11227-021-04153-0

  10. Sojoodi AH, Salimi Beni M, Khunjush F (2021) Ignite-GPU: a GPU-enabled in-memory computing architecture on clusters. J Supercomput 77(3):3165–3192

  11. Rani S, Gupta O (2017) CLUS_GPU-BLASTP: accelerated protein sequence alignment using GPU-enabled cluster. J Supercomput 73(10):4580–4595

  12. Bleichrodt F, Bisseling RH, Dijkstra HA (2012) Accelerating a barotropic ocean model using a GPU. Ocean Model 41:16–21

  13. Chen B, Zhu J, Li L (2012) Accelerating 3D ocean model development by using GPU computing. In: Deng W (ed) Future control and automation. Springer, Berlin, Heidelberg, pp 37–43

  14. Yamagishi T, Matsumura Y (2016) GPU acceleration of a non-hydrostatic ocean model with a multigrid Poisson/Helmholtz solver. Procedia Comput Sci 80:1658–1669. https://doi.org/10.1016/j.procs.2016.05.502. International Conference on Computational Science 2016, ICCS 2016, 6–8 June 2016, San Diego, California, USA

  15. Zhao X-d, Liang S-x, Sun Z-c, Zhao X-z, Sun J-w, Liu Z-b (2017) A GPU accelerated finite volume coastal ocean model. J Hydrodyn, Ser. B 29(4):679–690. https://doi.org/10.1016/S1001-6058(16)60780-1

  16. Panzer I, Lines S, Mak J, Choboter P, Lupo C (2013) High performance regional ocean modeling with GPU acceleration. In: 2013 OCEANS - San Diego, pp 1–4. https://doi.org/10.23919/OCEANS.2013.6741366

  17. Mak J, Choboter P, Lupo C (2011) Numerical ocean modeling and simulation with CUDA. In: OCEANS’11 MTS/IEEE KONA, pp 1–6. https://doi.org/10.23919/OCEANS.2011.6107199

  18. Xu S, Huang X, Oey L-Y, Xu F, Fu H, Zhang Y, Yang G (2015) POM.gpu-v1.0: a GPU-based Princeton Ocean Model. Geosci Model Dev 8(9):2815–2827. https://doi.org/10.5194/gmd-8-2815-2015

  19. Jiang J, Lin P, Wang J, Liu H, Chi X, Hao H, Wang Y, Wang W, Zhang L (2019) Porting LASG/IAP climate system ocean model to GPUs using OpenACC. IEEE Access 7:154490–154501. https://doi.org/10.1109/ACCESS.2019.2932443

  20. Wang P, Jiang J, Lin P, Ding M, Wei J, Zhang F, Zhao L, Li Y, Yu Z, Zheng W, Yu Y, Chi X, Liu H (2021) The GPU version of LASG/IAP climate system ocean model version 3 (LICOM3) under the heterogeneous-compute interface for portability (HIP) framework and its large-scale application. Geosci Model Dev 14(5):2781–2799. https://doi.org/10.5194/gmd-14-2781-2021

  21. Zhang X, Liang X (1989) A numerical world ocean general circulation model. Adv Atmos Sci 6(1):44–61. https://doi.org/10.1007/BF02656917

  22. Liu H, Lin P, Yu Y, Zhang X (2012) The baseline evaluation of LASG/IAP climate system ocean model (LICOM) version 2. Acta Meteorol Sin 26(3):318–329. https://doi.org/10.1007/s13351-012-0305-y

  23. Madec G, Imbard M (1996) A global ocean mesh to overcome the North Pole singularity. Climate Dyn 12(6):381–388. https://doi.org/10.1007/BF00211684

  24. Murray RJ (1996) Explicit generation of orthogonal grids for ocean models. J Comput Phys 126(2):251–273. https://doi.org/10.1006/jcph.1996.0136

  25. St Laurent L, Simmons H, Jayne S (2002) Estimates of tidally driven enhanced mixing in the deep ocean. Geophys Res Lett 29:2106. https://doi.org/10.1029/2002GL015633

  26. Ferreira D, Marshall J, Heimbach P (2005) The annual cycle of the global ocean circulation as determined by 4D-Var data assimilation. J Phys Oceanogr 35:1891–1910. https://doi.org/10.1175/JPO2785.1

  27. Lin P, Liu H, Xue W, Li H, Jiang J, Song M, Song Y, Wang F, Zhang M (2016) A coupled experiment with LICOM2 as the ocean component of CESM1. J Meteorol Res 30(1):76–92. https://doi.org/10.1007/s13351-015-5045-3

  28. McCartney MS, Talley LD (1982) The subpolar mode water of the North Atlantic Ocean. J Phys Oceanogr 12(11):1169–1188. https://doi.org/10.1175/1520-0485(1982)012<1169:TSMWOT>2.0.CO;2

  29. Gent PR, McWilliams JC (1990) Isopycnal mixing in ocean circulation models. J Phys Oceanogr 20(1):150–155. https://doi.org/10.1175/1520-0485(1990)020<0150:IMIOCM>2.0.CO;2

  30. Lin P, Yu Z, Liu H, Yu Y, Li Y, Jiang J, Xue W, Chen K, Yang Q, Zhao B et al (2020) LICOM model datasets for the CMIP6 Ocean Model Intercomparison Project. Adv Atmos Sci 37(3):239–249. https://doi.org/10.1007/s00376-019-9208-5

  31. Li Y, Liu H, Ding M, Lin P, Yu Z, Yu Y, Meng Y, Li Y, Jian X, Jiang J et al (2020) Eddy-resolving simulation of CAS-LICOM3 for phase 2 of the Ocean Model Intercomparison Project. Adv Atmos Sci 37(10):1067–1080. https://doi.org/10.1007/s00376-020-0057-z

  32. Zhang (2020) CAS-ESM 2: description and climate simulation performance of the Chinese Academy of Sciences (CAS) Earth System Model (ESM) version 2. J Adv Model Earth Syst. https://doi.org/10.1029/2020MS002210

  33. Wang T, Jiang J, Zhang M, Zhang H, He J, Hao H, Chi X (2020) Design and research of CAS-CIG for earth system models. Earth Space Sci 7(7):e2019EA000965. https://doi.org/10.1029/2019EA000965

  34. Liu H, Lin P, Zheng W, Luan Y, Ma J, Ding M, Mo H, Wan L, Ling T (2021) A global eddy-resolving ocean forecast system in China - LICOM Forecast System (LFS). J Oper Oceanogr. https://doi.org/10.1080/1755876X.2021.1902680

  35. Kerbyson DJ, Jones PW (2005) A performance model of the Parallel Ocean Program. Int J High Perform Comput Appl 19(3):261–276

  36. NVIDIA: CUDA C Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Accessed 15 August 2021

  37. Henderson T, Middlecoff J, Rosinski J, Govett M, Madden P (2011) Experience applying Fortran GPU compilers to numerical weather prediction. In: 2011 Symposium on Application Accelerators in High-Performance Computing, pp 34–41. https://doi.org/10.1109/SAAHPC.2011.9

  38. AMD: HIP Programming Guide. https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html Accessed 15 August 2021

  39. Harris M (2021) How to Optimize Data Transfers in CUDA C/C++. https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/ Accessed 15 August

  40. Yu R (1994) A two-step shape-preserving advection scheme. Adv Atmos Sci 11(4):479–490

  41. NVIDIA (2021): NVIDIA Collective Communication Library (NCCL) Documentation. https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/index.html Accessed 15 August

  42. Wei J, Jiang J, Liu H, Zhang F, Lin P, Wang P, Yu Y, Chi X, Zhao L, Ding M, Li Y, Yu Z, Zheng W, Wang Y (2022) LICOM3-CUDA: a GPU version of LASG/IAP climate system ocean model version 3 based on CUDA. https://doi.org/10.5281/zenodo.7440403

  43. Large WG, Yeager SG (2009) The global climatology of an interannually varying air–sea flux data set. Climate Dyn 33(2–3):341–364. https://doi.org/10.1007/s00382-008-0441-3

  44. Redi MH (1982) Oceanic isopycnal mixing by coordinate rotation. J Phys Oceanogr 12(10):1154–1158

  45. Fox-Kemper B, Menemenlis D (2008) Can large eddy simulation techniques improve mesoscale rich ocean models? In: Hecht W, Hasumi H (eds) Ocean modeling in an eddying regime. Geophysical Monograph Series, vol 177, American Geophysical Union, Washington DC, pp 319–337. https://doi.org/10.1029/177GM19

Acknowledgements

The study is funded by the National Natural Sciences Foundation (41931183), the National Key Research and Development Program (2016YFB0200800), and the “Earth System Science Numerical Simulator Facility” (EarthLab).

Author information

Corresponding authors

Correspondence to Jinrong Jiang or Hailong Liu.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wei, J., Jiang, J., Liu, H. et al. LICOM3-CUDA: a GPU version of LASG/IAP climate system ocean model version 3 based on CUDA. J Supercomput 79, 9604–9634 (2023). https://doi.org/10.1007/s11227-022-05020-2

  • DOI: https://doi.org/10.1007/s11227-022-05020-2
