Instruction Fetch Energy Reduction with Biased SRAMs

Multanen, Joonas; Viitanen, Timo; Jääskeläinen, Pekka; Takala, Jarmo

doi:10.1007/s11265-018-1367-6

Published: 26 April 2018

Volume 90, pages 1519–1532 (2018)

Journal of Signal Processing Systems

Joonas Multanen (ORCID: orcid.org/0000-0003-4438-2031)¹, Timo Viitanen¹, Pekka Jääskeläinen¹ & Jarmo Takala¹

Abstract

Especially in programmable processors, the energy consumption of integrated memories can become a limiting design factor due to thermal dissipation constraints and limited battery capacity. Consequently, contemporary efforts to improve memory technologies focus increasingly on energy efficiency, which has resulted in biased CMOS SRAM cells that save energy by favoring one logical value over the other. This paper proposes xor-masking, a method that exploits such low-power SRAMs to improve the energy efficiency of instruction fetching. Xor-masking uses statistics from static program analysis to produce optimal encoding masks that reduce the occurrence of the more energy-consuming bit values in the fetched instructions. The method is evaluated on LatticeMico32, a small RISC core popular in ultra-low-power designs, and on a wide-instruction-word, high-performance, low-power DSP. Compared to the previous “bus invert” technique typically used with similar SRAMs, the proposed method reduces instruction read energy consumption by up to 13% on the LatticeMico32 and 38% on the DSP core.
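The idea of deriving an encoding mask from program statistics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how masks are computed or how many are used, so the sketch assumes a single program-wide mask chosen by a per-bit majority vote, assumes that reading a logical 1 is the more expensive case, and models bus-invert in a simplified way with one invert flag per word. The function names, the 32-bit width, and the sample words are hypothetical and are not real LatticeMico32 instructions.

```python
from typing import List

WIDTH = 32  # assumed instruction width in bits


def majority_mask(words: List[int], width: int = WIDTH) -> int:
    """Set mask bit i when more than half of the words have a 1 in bit i."""
    mask = 0
    for bit in range(width):
        ones = sum((w >> bit) & 1 for w in words)
        if 2 * ones > len(words):
            mask |= 1 << bit
    return mask


def xor_encode(words: List[int], mask: int) -> List[int]:
    """XOR every word with the mask; applying it twice restores the words."""
    return [w ^ mask for w in words]


def count_ones(words: List[int]) -> int:
    """Total number of 1 bits, used here as a stand-in for read energy."""
    return sum(bin(w).count("1") for w in words)


def bus_invert_ones(words: List[int], width: int = WIDTH) -> int:
    """1 bits stored under per-word inversion, including the invert flag."""
    total = 0
    for w in words:
        ones = bin(w).count("1")
        if 2 * ones > width:
            total += (width - ones) + 1  # inverted word plus a flag bit of 1
        else:
            total += ones  # word stored as-is, flag bit is 0
    return total


# Hypothetical program image: three words share a common bit pattern.
program = [0xFF00FF01, 0xFF00FF02, 0xFF00FF04, 0x00000008]
mask = majority_mask(program)
encoded = xor_encode(program, mask)

print(hex(mask))                 # 0xff00ff00
print(count_ones(program))       # 52 one-bits without encoding
print(bus_invert_ones(program))  # 49 one-bits with the bus-invert model
print(count_ones(encoded))       # 20 one-bits with the single XOR mask
```

In this toy case the shared mask removes the bit pattern common to most words, whereas per-word inversion only helps when a word is more than half ones and pays for an extra flag bit. Decoding is the same XOR with the same mask, so under these assumptions the fetch path would need only a row of XOR gates and a stored mask.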

Acknowledgments

The authors would like to thank the TUT Graduate School, the Academy of Finland (project PLC), the Finnish Funding Agency for Technology and Innovation (project “Parallel Acceleration 3”, funding decision 1134/31/2015), and ARTEMIS JU under grant agreement no. 621439 (ALMARVI).

Author information

Authors and Affiliations

Customized Parallel Computing Group, Laboratory of Pervasive Computing, Tampere University of Technology, Tampere, Finland
Joonas Multanen, Timo Viitanen, Pekka Jääskeläinen & Jarmo Takala

Corresponding author

Correspondence to Joonas Multanen.

About this article

Cite this article

Multanen, J., Viitanen, T., Jääskeläinen, P. et al. Instruction Fetch Energy Reduction with Biased SRAMs. J Sign Process Syst 90, 1519–1532 (2018). https://doi.org/10.1007/s11265-018-1367-6

Received: 13 January 2017
Revised: 17 October 2017
Accepted: 12 April 2018
Published: 26 April 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s11265-018-1367-6
