Abstract
The main goal of this paper is to demonstrate that the newest generation of the NEC SX-Aurora TSUBASA architecture can perform large-scale graph processing extremely efficiently. The paper proposes approaches for developing high-performance vector-oriented implementations of the Page Rank and shortest paths algorithms, including a vectorised graph storage format, efficient vector-friendly graph traversals, optimised cache-aware memory accesses, and efficient load balancing. The developed implementations are optimised for the most important features and properties of the SX-Aurora architecture, which allows them to achieve up to 15 times better performance than optimised parallel implementations for Intel Skylake and up to 5 times better performance than the NVGRAPH library implementations for the Pascal GPU architecture.
Funding
This project was partially supported by the JSPS Bilateral Joint Research Projects program, under the project entitled “Theory and Practice of Vector Data Processing at Extreme Scale: Back to the Future”. The reported study was supported by the Russian Foundation for Basic Research, project no. 18-57-50005.
Additional information
Submitted by E. E. Tyrtyshnikov
Cite this article
Afanasyev, I.V., Voevodin, V.V., Voevodin, V.V. et al. Developing Efficient Implementations of Shortest Paths and Page Rank Algorithms for NEC SX-Aurora TSUBASA Architecture. Lobachevskii J Math 40, 1753–1762 (2019). https://doi.org/10.1134/S1995080219110039
DOI: https://doi.org/10.1134/S1995080219110039