Evaluation of the NEC Vector Engine for Legacy CFD Codes

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12728)

Abstract

Many codes still in production use trace their origins to code developed during the vector supercomputing era of the 1970s to the 1990s. The recently released NEC Vector Engine (VE) provides an opportunity to exploit this vector heritage. The VE can provide state-of-the-art performance without a complete rewrite of a well-validated codebase, and programs do not require an additional level of abstraction to use its capabilities. Given the time and cost required to port or rewrite codes, this is an attractive solution. Further tuning, as described in this paper, can realize maximum performance.

The goal was to assess how the NEC VE’s performance and ease of use compare with those of existing CPU architectures (e.g., AMD, Intel) using a legacy Computational Fluid Dynamics (CFD) solver, FDL3DI, written in Fortran. FDL3DI was originally vectorized and optimized for efficient operation on vector processing machines. The NEC VE’s architecture, high memory bandwidth, and ability to compile Fortran were the primary motivations for this evaluation.

Profiling the key compute kernels and modifying them with typical vector and NEC VE-specific optimizations allowed the code to utilize the vector engine hardware with minimal modification. Scalar code developed later in FDL3DI’s lifetime was replaced with vector-friendly implementations. With these optimizations, the vector architecture was found to be 3× faster for main-memory-bound problems, while the CPU architectures were competitive for smaller problem sizes. Achieving this performance with standard, well-known techniques is considered a key benefit of this architecture.

Keywords

  • Vectorization
  • CFD
  • Optimization

Distribution A: Approved for public release; Distribution unlimited.

Acknowledgements

This project is co-sponsored by the U.S. Department of Defense Foreign Comparative Testing Program within the Office of the Undersecretary of Defense for Research & Engineering, the DoD High Performance Computing Modernization Program, and by the Office of Naval Research through the Naval Research Laboratory 6.1 Materials Science Task Area. The collaboration with NEC was conducted via the NRL CRADA-20-716. The authors would like to thank the NEC consultants and supporting hardware and software teams who helped us understand and address any issues with the platform. Finally, we would like to thank Dr. D. Garmann at the U.S. Air Force Research Laboratory for his guidance on the FDL3DI code.

Author information

Corresponding author

Correspondence to Keith Obenschain.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Obenschain, K., Khine, Y.Y., Mathur, R., Patnaik, G., Rosenberg, R. (2021). Evaluation of the NEC Vector Engine for Legacy CFD Codes. In: Chamberlain, B.L., Varbanescu, A.L., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science, vol 12728. Springer, Cham. https://doi.org/10.1007/978-3-030-78713-4_14

  • DOI: https://doi.org/10.1007/978-3-030-78713-4_14

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78712-7

  • Online ISBN: 978-3-030-78713-4

  • eBook Packages: Computer Science, Computer Science (R0)