Analysis of Relationship Between SIMD-Processing Features Used in NVIDIA GPUs and NEC SX-Aurora TSUBASA Vector Processors

  • Ilya V. Afanasyev
  • Vadim V. Voevodin
  • Vladimir V. Voevodin
  • Kazuhiko Komatsu
  • Hiroaki Kobayashi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11657)


This paper presents a comprehensive analysis of the main SIMD-processing features and computational characteristics of three high-performance architectures: two NVIDIA GPU architectures (the Pascal and Volta generations) and the NEC SX-Aurora TSUBASA vector processor. Since both types of architecture rely heavily on SIMD processing, certain similarities can be found in their data-processing principles. However, although both the NVIDIA GPU and NEC SX-Aurora TSUBASA architectures include vectorised data processing, their vectorisation features are implemented in completely different ways. These differences lead to several fundamental restrictions on the classes of algorithms that can be implemented efficiently on each platform. This paper investigates the possibility of porting various classes of programs and algorithms between the discussed architectures, with a focus on utilising all available vectorisation features. This problem cannot be approached without a detailed analysis of the similar and different SIMD-processing features of these architectures. The analysis allowed us to identify several important examples of typical applications and algorithms, some of which demonstrated comparable efficiency and others different efficiency on NVIDIA GPUs and NEC SX-Aurora TSUBASA vector processors. These examples include reduction operations, programs relying on frequent indirect memory accesses, and data transfers through the co-processor interconnect. Moreover, the analysis makes it easy to extend this set of examples in order to approach the problem of automated porting of programs between the reviewed architectures, which we consider an important direction of our future research.
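Two of the kernel classes named in the abstract, reduction operations and indirect memory accesses, can be illustrated with a minimal scalar C++ sketch. The code below is not taken from the paper; the function names `reduce_sum` and `gather` are hypothetical and chosen only for illustration. It shows the loop shapes in question, not how either architecture executes them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduction: every iteration updates the same accumulator, so a SIMD
// implementation has to combine partial results across lanes.
// (Hypothetical helper, for illustration only.)
double reduce_sum(const std::vector<double>& data) {
    double sum = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        sum += data[i];
    }
    return sum;
}

// Indirect memory access (gather): the element read in iteration i is
// selected by idx[i], so neighbouring iterations may touch unrelated
// memory locations. (Hypothetical helper, for illustration only.)
std::vector<double> gather(const std::vector<double>& src,
                           const std::vector<std::size_t>& idx) {
    std::vector<double> out(idx.size());
    for (std::size_t i = 0; i < idx.size(); ++i) {
        assert(idx[i] < src.size());  // guard against out-of-range indices
        out[i] = src[idx[i]];
    }
    return out;
}
```

Both loops are simple in scalar form. What makes them interesting for a comparison of SIMD architectures is the accumulator dependence in the first and the data-dependent addressing in the second.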


NEC SX-Aurora TSUBASA, NVIDIA GPU, Vector processing, SIMD



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Research Computing Center of Moscow State University, Moscow, Russia
  2. Tohoku University, Sendai, Japan
