
Cluster Computing, Volume 22, Supplement 1, pp. 1815–1826

Design and implementation of initial OpenSHMEM on PCIe NTB based cloud computing

  • Cheol Shim
  • Kwang-ho Cha
  • Min Choi
Article

Abstract

Cloud computing services are provided by data centers in which homogeneous and heterogeneous computation nodes are connected by a high-speed interconnection network. The rapid growth of cloud-based services and applications has made data center networks increasingly complex. PCI Express is a widely used system bus technology that connects processors and peripheral I/O devices, and it is regarded as a de facto standard for system-area interconnection. The possibility of using PCI Express as a system interconnection network is currently being validated in areas such as high-performance computing and cluster/cloud computing. With the development of PCI Express non-transparent bridge (NTB) technology, PCI Express has become usable as a system interconnection network: an NTB allows two PCI Express subsystems to be interconnected and, when necessary, isolated from each other. Partitioned global address space (PGAS) is one of the shared address space programming models; with the recent spread of multicore processors, PGAS has been attracting attention as a parallel computing framework. We make use of the PCI Express NTB to realize the PGAS shared address space model. In this paper, we designed and implemented a PCI Express x8 interconnection network using an RDK, the PEX8749-based PCI Express evaluation board. We ran several OpenSHMEM applications from GitHub to verify the correctness of our initial OpenSHMEM API implementation.

Keywords

PCI Express · Non-transparent bridge · Interconnection network · RDMA · One-sided communication

Notes

Acknowledgements

This research has been performed as a subproject of Project No. K-17-L01-C01 (Development of key enabling technologies for massively parallel and high-density computing) supported by the Korea Institute of Science and Technology Information (KISTI). This research was also jointly supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2016R1C1B1012189).


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Information and Communication Engineering, Chungbuk National University, Cheongju, Korea
  2. Center for Supercomputer Development, Korea Institute of Science and Technology Information, Daejeon, Korea
