QCD Library for GPU Cluster with Proprietary Interconnect for GPU Direct Communication

  • Norihisa Fujita
  • Hisafumi Fujii
  • Toshihiro Hanawa
  • Yuetsu Kodama
  • Taisuke Boku
  • Yoshinobu Kuramashi
  • Mike Clark
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8805)

Abstract

QUDA is a Lattice QCD library that runs on NVIDIA Graphics Processing Unit (GPU) accelerators and is widely used as a framework for Lattice QCD applications. In this paper, we apply our novel proprietary interconnect, the Tightly Coupled Accelerators (TCA) architecture, to inter-node GPU communication in QUDA. The TCA architecture was developed for low-latency inter-node communication among accelerators connected through the PCI Express (PCIe) bus on PC clusters. It enables direct memory copy between accelerators, such as GPUs, across nodes in the same manner as an intra-node PCIe transaction. We assess the performance of TCA with QUDA on HA-PACS/TCA, a high-density GPU cluster that serves as a proof-of-concept testbed for the TCA architecture. The results show that our interconnect significantly reduces communication latency and achieves better strong scaling than ordinary InfiniBand solutions on PC clusters with GPUs. Measured by the execution time of a Conjugate Gradient (CG) iteration, the TCA implementation is 2.14 times faster than the peer-to-peer MPI implementation and 1.96 times faster than the MPI remote-memory access (RMA) implementation, where a dual-rail InfiniBand QDR network is used in both MPI cases.
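
For readers unfamiliar with the Conjugate Gradient iteration whose execution time is reported above, the following is a minimal single-process sketch in Python. It is not QUDA code and not the authors' implementation: the operator `apply_stencil` is a hypothetical 1D stencil standing in for the lattice operator, and all function names are chosen here for illustration. The comments mark the two places where a domain-decomposed multi-GPU solver generally needs inter-node communication, which is why communication latency tends to matter for the time per iteration.

```python
# Minimal Conjugate Gradient sketch (pure Python, single process).
# Assumption: apply_stencil is a hypothetical 1D stand-in operator,
# A = tridiag(-1, 4, -1), which is symmetric positive definite.
# It is NOT QUDA's lattice operator.

def apply_stencil(x):
    """Return A @ x for A = tridiag(-1, 4, -1) with zero boundaries."""
    n = len(x)
    y = []
    for i in range(n):
        # In a multi-node run, the neighbor values at subdomain edges
        # would come from another node (boundary/halo exchange).
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(4.0 * x[i] - left - right)
    return y


def dot(a, b):
    """Inner product; a multi-node run would need a global reduction here."""
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(b, tol=1e-10, max_iter=1000):
    """Solve A x = b and return (x, number of iterations performed)."""
    x = [0.0] * len(b)
    r = list(b)          # residual r = b - A x, with x = 0 initially
    p = list(r)          # search direction
    rr = dot(r, r)
    for iteration in range(max_iter):
        if rr ** 0.5 < tol:
            return x, iteration
        ap = apply_stencil(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    rhs = [1.0] * 16
    solution, iterations = conjugate_gradient(rhs)
    print("iterations:", iterations)
```

Each iteration performs one operator application and two inner products, so in a distributed setting every iteration involves both neighbor exchanges and global reductions of small messages.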

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Norihisa Fujita (1)
  • Hisafumi Fujii (1)
  • Toshihiro Hanawa (2)
  • Yuetsu Kodama (3)
  • Taisuke Boku (1, 3)
  • Yoshinobu Kuramashi (3)
  • Mike Clark (4)
  1. Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
  2. Information Technology Center, The University of Tokyo, Japan
  3. Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
  4. NVIDIA Corporation, USA
