GPU Porting of Scalable Implicit Solver with Green’s Function-Based Neural Networks by OpenACC

  • Conference paper
Accelerator Programming Using Directives (WACCPD 2021)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 13194)

Abstract

With the growing diversity of computer architectures and HPC applications, it is desirable to develop performance-portable applications that run on multiple architectures at relatively low development cost. Directive-based programming models such as OpenACC were developed for this purpose and have been used successfully to port many equation-based HPC applications. As an example of porting a class of HPC applications that combines data-analytics methods with equation-based methods, we port an implicit solver with a neural network (NN)-type preconditioner for solving large-scale partial differential equation (PDE)-based problems. The scalable preconditioner is based on Green’s functions that reflect the properties of the target PDE, which improves the accuracy and efficiency of using NNs to solve PDE-based problems. By designing the kernel algorithm to suit the computer architecture and by using OpenACC, we achieved high performance on recent GPUs at relatively low development cost. Specifically, 64.4% of FP64 peak was obtained on NVIDIA A100 GPU-equipped nodes of the AI Bridging Cloud Infrastructure at the National Institute of Advanced Industrial Science and Technology, a 2.54-fold speedup over a highly tuned GPU implementation of a widely used PDE solver algorithm and a 38.9-fold speedup over an OpenMP-based CPU implementation running on the same system. Furthermore, 83.4% weak scalability was obtained from 8 to 256 A100 GPUs on the same system, which enables solving large-scale problems of up to 25.7 billion degrees of freedom with high performance.

Notes

  1. Because OpenACC by default performs implicit data transfers between CPU and GPU every time an acc parallel/kernels region is executed, all necessary variables are copied into GPU memory in advance, and only the minimum necessary acc update host/device directives are used in the offloaded region.
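The data-placement strategy in this note can be sketched in C as follows. This is a minimal illustration and not the authors' solver code: the function name `axpy_resident`, its arguments, and the AXPY-style update are all hypothetical stand-ins for the real kernels. An enclosing `acc data` region copies the arrays to the GPU once, the `present` clause tells each compute region that the data is already on the device, and a single `acc update host` brings back only the one element the host needs.

```c
/* Sketch only: a hypothetical AXPY-style iteration, not the paper's kernels.
 * Requires n >= 1. Without an OpenACC compiler the pragmas are ignored and
 * the function runs on the host with the same result. */
double axpy_resident(double a, const double *x, double *y, int n, int n_iter)
{
    double probe = 0.0;

    /* Copy x to the device once; copy y in at entry and back out at exit. */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        for (int it = 0; it < n_iter; ++it) {
            /* present(...) states that the arrays are already on the device,
             * so no transfer is made when this region runs. */
            #pragma acc parallel loop present(x[0:n], y[0:n])
            for (int i = 0; i < n; ++i) {
                y[i] += a * x[i];
            }
        }

        /* Transfer only the single element the host wants to inspect. */
        #pragma acc update host(y[0:1])
        probe = y[0];
    }
    return probe;
}
```

Calling `axpy_resident(0.5, x, y, 4, 2)` with `x = {1, 2, 3, 4}` and `y` initialized to zero adds `0.5 * x[i]` to each `y[i]` twice and returns `y[0]`. Keeping the iteration loop inside the data region is the design choice that matters here: it moves the host-device transfers out of the loop body.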

Acknowledgments

Results were obtained using AI Bridging Cloud Infrastructure (ABCI) at National Institute of Advanced Industrial Science and Technology (AIST). This work was supported by MEXT as “Program for Promoting Researches on the Supercomputer Fugaku” (Large-scale numerical simulation of earthquake generation, wave propagation and soil amplification: hp200126, hp210171) and JSPS KAKENHI Grant Numbers JP18H05239, JP18H03795, JP17K14719.

Author information

Correspondence to Kohei Fujita.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Fujita, K., Kikuchi, Y., Ichimura, T., Hori, M., Maddegedara, L., Ueda, N. (2022). GPU Porting of Scalable Implicit Solver with Green’s Function-Based Neural Networks by OpenACC. In: Bhalachandra, S., Daley, C., Melesse Vergara, V. (eds) Accelerator Programming Using Directives. WACCPD 2021. Lecture Notes in Computer Science, vol 13194. Springer, Cham. https://doi.org/10.1007/978-3-030-97759-7_4

  • DOI: https://doi.org/10.1007/978-3-030-97759-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-97758-0

  • Online ISBN: 978-3-030-97759-7

  • eBook Packages: Computer Science (R0)
