Implications of shallower memory controller transaction queues in scalable memory systems

The Journal of Supercomputing

Abstract

Scalable memory systems provide bandwidth that scales with the growing core counts of multicore and embedded-system processors. In these systems, as the number of memory controllers (MCs) is scaled up, memory traffic per MC is reduced, so transaction queues become shallower. This creates an opportunity to explore transaction queue utilization and its impact on energy utilization. In this paper, we evaluate the performance and energy-per-bit impact of reducing transaction queue sizes as the MCs of these systems are scaled. Experimental results show that reducing the number of entries by 50 % does not affect bandwidth or energy-per-bit levels, whereas an aggressive reduction of about 90 % similarly reduces bandwidth and causes significantly higher energy-per-bit utilization.
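The relationship between lost bandwidth and higher energy-per-bit can be illustrated with a back-of-the-envelope model. The sketch below is not the authors' methodology: it assumes, purely for illustration, that traffic is spread evenly across MCs and that memory-system power stays constant while delivered bandwidth changes. All function names and numbers are hypothetical.

```python
def traffic_per_mc(total_requests: float, num_mcs: int) -> float:
    """Average requests each MC receives, assuming an even spread (an assumption)."""
    if num_mcs <= 0:
        raise ValueError("num_mcs must be positive")
    return total_requests / num_mcs


def energy_per_bit_pj(power_watts: float, bandwidth_gb_per_s: float) -> float:
    """Energy per transferred bit in picojoules, given power (W) and bandwidth (GB/s)."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return power_watts / bits_per_second * 1e12


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper.
    power_w = 2.0
    for label, bw in [("baseline bandwidth", 10.0), ("bandwidth cut by 90 %", 1.0)]:
        print(f"{label}: {energy_per_bit_pj(power_w, bw):.1f} pJ/bit")
    # More MCs means fewer requests per MC, hence shallower queues.
    for mcs in (1, 4, 16):
        print(f"{mcs} MCs: {traffic_per_mc(1600, mcs):.0f} requests per MC")
```

With power held fixed, a tenfold drop in bandwidth gives a tenfold rise in energy-per-bit in this model. Real systems also change power with load, so the actual increase would differ.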

Acknowledgments

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Ministry of Science and Technology (MOST, Taiwan), Providence University, or NVIDIA. This research is based upon work partially supported by the Ministry of Science and Technology (MOST, Taiwan), Providence University, and NVIDIA. We would like to thank Maria Amelia Guitti Marino and the anonymous reviewers for their feedback and suggestions.

Author information

Corresponding author

Correspondence to Kuan-Ching Li.

About this article

Cite this article

Marino, M.D., Li, KC. Implications of shallower memory controller transaction queues in scalable memory systems. J Supercomput 72, 1785–1798 (2016). https://doi.org/10.1007/s11227-015-1485-x

  • DOI: https://doi.org/10.1007/s11227-015-1485-x
