GPU Best Practices for HPC Applications at Industry Scale

  • Chapter
GPU Solutions to Multi-scale Problems in Science and Engineering

Part of the book series: Lecture Notes in Earth System Sciences (LNESS)

Abstract

Current trends in high performance computing (HPC) are moving towards placing several cores on a single processor chip, so that applications can achieve speed-up by exploiting fine-grain parallelism. The trend is led by graphics processing units (GPUs), which have recently been developed exclusively for computational tasks as massively parallel co-processors to conventional x86 CPUs. Since the introduction of the NVIDIA Tesla GPU and the CUDA programming environment in 2006, the HPC community has achieved notable performance gains across a broad range of application software. In particular, various scientific research disciplines within computational physics and chemistry have reported speed-ups as high as two orders of magnitude over current quad-core CPUs. During 2010, an extensive set of new HPC architectural features was offered in the third generation of Tesla and CUDA (codenamed Fermi), giving engineering disciplines a similar opportunity to expand their use of GPUs for applications relevant to industry modeling and simulation. As in the scientific research community, practical applications in industry see constant growth in model fidelity, but the parallel efficiency of commercial software and job completion times also become important factors in decisions on model size and scale and on the level of physics features to include. This work examines best practices in algorithmic development, together with performance results of application software for the Tesla Fermi architecture, in modeling and simulation examples relevant to industry-scale HPC practice. Included are GPU implementations of computational structural mechanics (CSM) and computational fluid dynamics (CFD) software that support mechanical product design in manufacturing industries. Specifically, the critical requirements of memory optimization and storage formats are discussed for grid-based direct solvers that appear in CSM and for highly irregular sparse matrices that require iterative solver schemes in CFD.

Author information

Corresponding author

Correspondence to Peng Wang.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Wang, P., Posey, S. (2013). GPU Best Practices for HPC Applications at Industry Scale. In: Yuen, D., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (eds) GPU Solutions to Multi-scale Problems in Science and Engineering. Lecture Notes in Earth System Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16405-7_9
