http://mvapich.cse.ohio-state.edu/benchmarks
High-Performance Center Overview. https://www.hpcadvisorycouncil.com/cluster_center.php
Mellanox BlueField. https://docs.mellanox.com/x/iQO3
Panda, D.K., Subramoni, H., Chu, C.H., Bayatpour, M.: The MVAPICH project: Transforming research into high-performance MPI library for HPC community. J. Comput. Sci. 101208 (2020)
Bayatpour, M., et al.: Communication-aware hardware-assisted MPI overlap engine. In: Sadayappan, P., Chamberlain, B.L., Juckeland, G., Ltaief, H. (eds.) ISC High Performance 2020. LNCS, vol. 12151, pp. 517–535. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50743-5_26
Bayatpour, M., Ghazimirsaeed, S.M., Xu, S., Subramoni, H., Panda, D.K.: Design and characterization of InfiniBand hardware tag matching in MPI. In: 20th Annual IEEE/ACM CCGRID (2020)
Gropp, W., Lusk, E., Doss, N., Skjellum, A.: A high-performance, portable implementation of the MPI message passing interface standard. Technical report, Argonne National Laboratory and Mississippi State University
InfiniBand Trade Association (2017). http://www.infinibandta.com
Kandalla, K., Subramoni, H., Tomko, K., Pekurovsky, D., Sur, S., Panda, D.K.: High-performance and scalable non-blocking all-to-all with collective offload on InfiniBand clusters: a study with parallel 3D FFT. Comput. Sci. Res. Dev. 26, 237–246 (2011)
Liu, M., Cui, T., Schuh, H., Krishnamurthy, A., Peter, S., Gupta, K.: iPipe: a framework for building distributed applications on SmartNICs. In: SIGCOMM 2019: Proceedings of the ACM Special Interest Group on Data Communication, pp. 318–333 (2019). https://doi.org/10.1145/3341302.3342079
Network-Based Computing Laboratory: MVAPICH2-X (Unified MPI+PGAS Communication Runtime over OpenFabrics/Gen2 for Exascale Systems). http://mvapich.cse.ohio-state.edu/overview/mvapich2x/
Pekurovsky, D.: P3DFFT library (2006–2009). www.sdsc.edu/us/resources/p3dfft/
Phothilimthana, P.M., Liu, M., Kaufmann, A., Peter, S., Bodik, R., Anderson, T.: Floem: a programming system for NIC-accelerated network applications. In: 13th USENIX Symposium on Operating Systems Design and Implementation (2018)
Potluri, S., et al.: MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for Intel MIC clusters. In: Proceedings of SC 2013, pp. 54:1–54:11 (2013)
Scalable hierarchical aggregation protocol. https://www.mellanox.com/products/sharp
Subramoni, H., Kandalla, K., Sur, S., Panda, D.K.: Design and evaluation of generalized collective communication primitives with overlap using ConnectX-2 offload engine. In: International Symposium on Hot Interconnects (HotI), August 2010
Subramoni, H., et al.: Designing non-blocking personalized collectives with near perfect overlap for RDMA-enabled clusters. In: Kunkel, J.M., Ludwig, T. (eds.) ISC High Performance 2015. LNCS, vol. 9137, pp. 434–453. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-20119-1_31
Sur, S., Jin, H.W., Chai, L., Panda, D.K.: RDMA read based rendezvous protocol for MPI over InfiniBand: design alternatives and benefits. In: Proceedings of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2006 (2006)