GPU Best Practices for HPC Applications at Industry Scale

Wang, Peng; Posey, Stan

doi:10.1007/978-3-642-16405-7_9

Part of the book series: Lecture Notes in Earth System Sciences (LNESS)

Abstract

Current trends in high performance computing (HPC) are moving towards placing several cores on the same processor chip, with speed-up achieved by exploiting the fine-grain parallelism in applications. The trend is led by graphics processing units (GPUs), which have recently been developed exclusively for computational tasks as massively parallel co-processors to conventional x86 CPUs. Since the introduction of the NVIDIA Tesla GPU and the CUDA programming environment in 2006, the HPC community has achieved notable performance gains across a broad range of application software. In particular, various scientific research disciplines within computational physics and chemistry have reported speed-ups as high as two orders of magnitude over current quad-core CPUs. During 2010, an extensive set of new HPC architectural features was offered in the third generation of Tesla and CUDA (codenamed Fermi), giving engineering disciplines a similar opportunity to expand their use of GPUs for applications relevant to industry modeling and simulation. As in the scientific research community, practical applications in industry see constant growth in model fidelity, but the parallel efficiency of commercial software and job completion times also become important factors in decisions on model size and scale and on the level of physics features to include. This work examines best practices in algorithmic development, together with performance results of application software, for the Tesla Fermi architecture in modeling and simulation examples relevant to industry-scale HPC practice. Included are GPU implementations of computational structural mechanics (CSM) and computational fluid dynamics (CFD) software that support mechanical product design in manufacturing industries. Specifically, the critical requirements of memory optimization and storage formats are discussed for grid-based direct solvers that appear in CSM and for highly irregular sparse matrices that require iterative solver schemes in CFD.
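
As a rough illustration of what a sparse storage format involves, the sketch below uses compressed sparse row (CSR), one widely used layout for irregular sparse matrices. It is not taken from the chapter: the function names dense_to_csr and csr_matvec are hypothetical, and the code is plain Python for readability rather than a GPU implementation. It shows the sparse matrix-vector product, the operation that dominates most iterative solvers.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(
    dense: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense row-major matrix to CSR arrays (hypothetical helper).

    Returns (values, col_idx, row_ptr), where row i owns the slice
    values[row_ptr[i]:row_ptr[i + 1]].
    """
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0.0:  # store only the nonzero entries
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's slice
    return values, col_idx, row_ptr


def csr_matvec(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Walk only the nonzeros of row i; x is read indirectly via col_idx.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    # Small example matrix, chosen only for illustration.
    A = [
        [4.0, 0.0, 0.0, 1.0],
        [0.0, 3.0, 0.0, 0.0],
        [2.0, 0.0, 5.0, 0.0],
        [0.0, 0.0, 0.0, 6.0],
    ]
    vals, cols, ptr = dense_to_csr(A)
    print(vals, cols, ptr)
    print(csr_matvec(vals, cols, ptr, [1.0, 2.0, 3.0, 4.0]))
```

The indirect read x[col_idx[k]] is where memory behavior becomes the main concern: rows of very different lengths and scattered column indices lead to uneven work and non-contiguous memory accesses, which is one reason the choice of storage format matters for irregular matrices on GPUs.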

Author information

Authors and Affiliations

NVIDIA, Santa Clara, CA, USA
Peng Wang & Stan Posey

Corresponding author

Correspondence to Peng Wang.

Editor information

Editors and Affiliations

University of Minnesota, Dept. of Earth Sciences and Minnesota Supercomputing Institute, Pillsbury Hall 23, Minneapolis, Minnesota 55455, USA
David A. Yuen
Network Information Center, Computer Center and Computer, Zhong Guan Cun 4, Beijing 100190, People's Republic of China
Long Wang
Supercomputing Center, Zhong Guan Cun 4, Beijing 100190, People's Republic of China
Xuebin Chi
Computer Science, University of Houston, Calhoun Street 4800, Houston, Texas 77204, USA
Lennart Johnsson
Inst. Process Engineering (IPE), Chinese Academy of Sciences, Zhongguancun North Second Street 1, Beijing 100190, People's Republic of China
Wei Ge
Laboratory of Computational Geodynamics, Chinese Academy of Sciences, Yu Quan Lu 19a, Beijing 100049, People's Republic of China
Yaolin Shi

About this chapter

Cite this chapter

Wang, P., Posey, S. (2013). GPU Best Practices for HPC Applications at Industry Scale. In: Yuen, D., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (eds) GPU Solutions to Multi-scale Problems in Science and Engineering. Lecture Notes in Earth System Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16405-7_9

DOI: https://doi.org/10.1007/978-3-642-16405-7_9
Published: 09 January 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16404-0
Online ISBN: 978-3-642-16405-7
eBook Packages: Earth and Environmental Science, Earth and Environmental Science (R0)
