Adapting Memory Hierarchies for Emerging Datacenter Interconnects

Jiang, Tao; Hou, Rui; Dong, Jian-Bo; Chai, Lin; McKee, Sally A.; Tian, Bin; Zhang, Li-Xin; Sun, Ning-Hui

doi:10.1007/s11390-015-1507-4


Regular Paper
Published: 21 January 2015

Journal of Computer Science and Technology, Volume 30, pages 97–109 (2015)

Tao Jiang¹˒², Rui Hou¹, Jian-Bo Dong¹, Lin Chai¹˒², Sally A. McKee³, Bin Tian⁴, Li-Xin Zhang¹ & Ning-Hui Sun¹


Abstract

Efficient resource utilization requires that emerging datacenter interconnects support both high-performance communication and efficient remote resource sharing. These goals require that the network be more tightly coupled with the CPU chips. Designing a new interconnection technology thus requires considering not only the interconnection itself, but also the design of the processors that will rely on it. In this paper, we study memory hierarchy implications for the design of high-speed datacenter interconnects, particularly as they affect remote memory access, and we use PCIe as the vehicle for our investigations. To that end, we build three complementary platforms: a PCIe-interconnected prototype server with which we measure and analyze current bottlenecks; a software simulator that lets us model microarchitectural and cache hierarchy changes; and an FPGA prototype system with a streamlined, switchless, customized protocol, Thunder, with which we study hardware optimizations outside the processor. We highlight several architectural modifications to better support remote memory access and communication, and quantify their impact and limitations.



Author information

Authors and Affiliations

State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Tao Jiang, Rui Hou, Jian-Bo Dong, Lin Chai, Li-Xin Zhang & Ning-Hui Sun
University of Chinese Academy of Sciences, Beijing, 100049, China
Tao Jiang & Lin Chai
Computer Science and Engineering, Chalmers University of Technology, Gothenburg, 41296, Sweden
Sally A. McKee
National High Performance Integrated Circuit Design Center (Shanghai), Shanghai, 201204, China
Bin Tian


Corresponding author

Correspondence to Tao Jiang.

Additional information

This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401, and the National Natural Science Foundation of China under Grant Nos. 61100010, 61402438, and 61402439.


About this article

Cite this article

Jiang, T., Hou, R., Dong, JB. et al. Adapting Memory Hierarchies for Emerging Datacenter Interconnects. J. Comput. Sci. Technol. 30, 97–109 (2015). https://doi.org/10.1007/s11390-015-1507-4


Received: 14 July 2014
Revised: 15 December 2014
Published: 21 January 2015
Issue Date: January 2015
DOI: https://doi.org/10.1007/s11390-015-1507-4
