Early Performance Assessment of the ThunderX2 Processor for Lattice Based Simulations

  • Enrico Calore
  • Alessandro Gabbana
  • Fabio Rinaldi
  • Sebastiano Fabio Schifano (corresponding author)
  • Raffaele Tripiccione
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12043)

Abstract

This paper presents an early performance assessment of the ThunderX2, the most recent Arm-based multi-core processor designed for HPC applications. As benchmarks we use well-known stencil-based Lattice Boltzmann (LBM) and Lattice QCD (LQCD) algorithms, widely used to study fluid flows and the interaction properties of elementary particles, respectively. We run benchmark kernels derived from OpenMP production codes, measure performance as a function of the number of threads, and evaluate the impact of different data-layout choices. We then analyze our results in the framework of the roofline model and compare them with the performance measured on mainstream Intel Skylake processors. We find that these Arm-based processors reach performance levels competitive with those of other state-of-the-art options.
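For readers unfamiliar with the roofline model mentioned above, the following is a minimal sketch of its central bound: attainable performance is the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The function names and the machine parameters (PEAK_GFLOPS, BANDWIDTH_GBS) are illustrative assumptions only; they are not ThunderX2 or Skylake figures and are not taken from the paper.

```python
def roofline_bound(peak_gflops: float, bandwidth_gbs: float,
                   intensity_flop_per_byte: float) -> float:
    """Upper bound on attainable GFLOP/s under the roofline model."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity (FLOP/byte) at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine parameters, chosen only to make the arithmetic easy.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 200.0

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    for intensity in (0.5, 2.0, 5.0, 20.0):
        bound = roofline_bound(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
        regime = "memory-bound" if intensity < ridge else "compute-bound"
        print(f"AI = {intensity:5.1f} FLOP/byte -> bound = {bound:7.1f} GFLOP/s ({regime})")
```

A kernel whose arithmetic intensity lies below the ridge point is limited by memory bandwidth, so under this model its performance depends on how efficiently data is moved rather than on the peak compute rate.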

Keywords

ThunderX2, Lattice-Boltzmann, Lattice-QCD

Notes

Acknowledgments

This work has been done in the framework of the COKA and COSA projects funded by INFN. We would like to thank CINECA (Italy) and Università di Ferrara for access to their HPC systems. All runs on the ThunderX2 have been performed on computational resources provided and supported by E4 Computer Engineering and installed at CINECA.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Enrico Calore (2)
  • Alessandro Gabbana (1, 2)
  • Fabio Rinaldi (1)
  • Sebastiano Fabio Schifano (1, 2), corresponding author
  • Raffaele Tripiccione (1, 2)

  1. Università degli Studi di Ferrara, Ferrara, Italy
  2. INFN Sezione di Ferrara, Ferrara, Italy
