Vortex lattice and panel methods belong to a broad family of aerodynamic codes based on potential flow theory. They are used for preliminary aerodynamic studies in the early stages of aircraft design, where hundreds of thousands of candidate configurations are analyzed. In this paper, we describe their efficient implementation on modern multi- and many-core architectures. We show how to bridge the ‘ninja gap’, defined as the performance gap between unoptimized C/C++ code and the best-optimized CPU code. We port the Vortex Lattice Method to a Graphics Processing Unit using the OpenACC standard. We also present an elegant solution for implementing data movements for C++ classes.
This work was supported by an NSERC Engage Grant with Cray Canada ULC as an industrial partner. M. Chrust and E. Laurendeau would like to thank Cray Canada ULC for providing access to the computing resources.
Chrust, M., Laurendeau, E. & Ostiguy, L. Accelerating low-fidelity aerodynamic codes on multi- and many-core architectures. J Supercomput 71, 3456–3481 (2015). https://doi.org/10.1007/s11227-015-1444-6