Evaluation of the Suitability of Intel Xeon Phi Clusters for the Simulation of Ultrasound Wave Propagation Using Pseudospectral Methods
The ability to perform large-scale ultrasound simulations using Fourier pseudospectral methods has generated significant interest in medical ultrasonics, including for treatment planning in therapeutic ultrasound and image reconstruction in photoacoustic tomography. However, the routine execution of such simulations is computationally very challenging. The current trend in parallel computing is towards accelerated clusters, where computationally intensive parts are offloaded from processors to accelerators. Over the last five years, Intel has released two generations of Xeon Phi accelerators. The goal of this paper is to investigate the performance of both architectures relative to current processors, and to evaluate the suitability of accelerated clusters for the distributed simulation of ultrasound propagation using Fourier-based methods. The paper reveals that the first generation of Xeon Phi, the Knights Corner architecture, suffers from several flaws that reduce its performance far below that of Haswell processors. On the other hand, the second generation, called Knights Landing, shows very promising performance comparable with current processors.
Keywords: Ultrasound simulations, Pseudospectral methods, k-Wave toolbox, Intel Xeon Phi, KNC, KNL, MPI, OpenMP, Performance evaluation, Scaling
This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science - LQ1602 and by the IT4Innovations infrastructure which is supported from the Large Infrastructures for Research, Experimental Development and Innovations project IT4Innovations National Supercomputing Center - LM2015070. This project has received funding from the European Union’s Horizon 2020 research and innovation programme H2020 ICT 2016–2017 under grant agreement No 732411 and is an initiative of the Photonics Public Private Partnership. This work was also supported by the Engineering and Physical Sciences Research Council, UK, grant numbers EP/L020262/1 and EP/P008860/1.