HMC and DDR Performance Trade-offs

  • Paulo C. Santos
  • Marco A. Z. Alves
  • Luigi Carro
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 523)


The evolution of main memories, from SDR to the current DDR, has brought multiple technological breakthroughs, yet main memories still fall far short of processor requirements. The advent of the Hybrid Memory Cube (HMC) promises high bandwidth with low energy consumption and less area, and may therefore provide better efficiency than traditional DDR modules. This is especially attractive for embedded systems. In this paper, we perform a comprehensive performance comparison between HMC and DDR memories to understand the capabilities and limitations of both. Simulation results running SPEC-CPU2006 and SPEC-OMP2001 benchmarks show that applications with low memory pressure behave similarly with HMC or DDR. We make the new observation that HMC performs better than DDR, especially for applications with high memory pressure and low spatial data locality. However, for applications with streaming behavior, commonly present in the embedded system domain, our experiments show that current HMC row-buffer specifications do not take advantage of the spatial locality present in those applications.


HMC, DDR, Main memory, Performance evaluation



Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  1. Informatics Institute, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
