Design exploration of a NVM based hybrid instruction memory organization for embedded platforms


Non-volatile memory (NVM) technologies are being explored extensively as replacements for conventional SRAM-based memories. This paper explores an NVM-based instruction memory for low-power embedded systems targeting wireless or multimedia applications. A traditional SRAM-based instruction memory organization suited to the target applications is taken as the baseline. Different Resistive RAM (ReRAM) based organizations are then designed as alternatives, taking into account ReRAM's write-related limitations and the energy and performance trade-offs. The NVM array design is explored and optimized against these trade-offs. Dynamic instruction mapping and architectural design changes are used to mitigate ReRAM's limitations and maximize its benefits. Energy and performance values are obtained by extending CACTI models and through SPICE and VHDL simulations. The best ReRAM-based hybrid instruction memory organization obtained with the proposed methodology shows significantly lower energy consumption (up to 82.07% read energy reduction), even with a 0% performance penalty.

Author information



Corresponding author

Correspondence to Manu Perumkunnil Komalan.

Additional information

This is a submission for the Special Issue on Memory Architecture and Organization for Embedded Systems. This project was partially funded by the Spanish government’s research contract: TIN 2012-32180.

About this article

Cite this article

Komalan, M.P., Pérez, J.I.G., Tenllado, C. et al. Design exploration of a NVM based hybrid instruction memory organization for embedded platforms. Des Autom Embed Syst 17, 459–483 (2013).

  • Non volatile memory
  • Instruction memory organization
  • Loop buffer
  • Very wide register