Software Cost Analysis of GPU-Accelerated Aeroacoustics Simulations in C++ with OpenACC

  • Marco Nicolini
  • Julian Miller (email author)
  • Sandra Wienke
  • Michael Schlottke-Lakemper
  • Matthias Meinke
  • Matthias S. Müller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9945)


Aeroacoustics simulations leverage the tremendous computational power of today’s supercomputers, e.g., to predict the noise emissions of airplanes. The emergence of GPUs that can be programmed through directive-based models such as OpenACC promises a cost-efficient solution for flow-induced noise simulations with respect to both hardware expenditure and development time. However, OpenACC’s capabilities for real-world C++ codes have rarely been investigated so far, and software costs are seldom evaluated and modeled for high-performance computing projects of this kind. In this paper, we present our OpenACC parallelization of ZFS, an aeroacoustics simulation framework written in C++, and its early performance results. From our implementation work, we derive common pitfalls and lessons learned for real-world C++ codes using OpenACC. Furthermore, we borrow software cost estimation techniques from software engineering to evaluate the development effort needed in a directive-based HPC environment. We discuss the applicability and challenges of the popular COCOMO II model when applied to the parallelization of ZFS.
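
For orientation, the effort equation of the COCOMO II post-architecture model, in the form given in the model's general published definition, can be sketched as follows. The constants A and B are the COCOMO II.2000 calibration values, not values taken from this paper. How the authors rate the scale factors and effort multipliers for the ZFS parallelization is not stated in the abstract and is left open here.

```latex
\begin{align}
  PM &= A \cdot \mathit{Size}^{E} \cdot \prod_{i=1}^{n} EM_i, \\
  E  &= B + 0.01 \sum_{j=1}^{5} SF_j
\end{align}
% PM:   estimated effort in person-months
% Size: code size in thousands of source lines of code (KSLOC)
% EM_i: effort multipliers (n = 17 in the post-architecture model)
% SF_j: the five scale factors
% COCOMO II.2000 calibration (assumed here): A = 2.94, B = 0.91
```

Because the exponent E exceeds 1 whenever the scale factors sum to more than 9, the estimated effort grows faster than linearly with code size in that case.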


Keywords: Discontinuous Galerkin, Discontinuous Galerkin Method, Code Transformation, Reuse Model, Effort Multiplier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Marco Nicolini (1)
  • Julian Miller (1), email author
  • Sandra Wienke (1, 2)
  • Michael Schlottke-Lakemper (2, 3)
  • Matthias Meinke (3)
  • Matthias S. Müller (1, 2)

  1. IT Center, RWTH Aachen University, Aachen, Germany
  2. JARA – High-Performance Computing, Aachen, Germany
  3. Institute of Aerodynamics, RWTH Aachen University, Aachen, Germany
