An Asymmetrical Register File: The VWR
This chapter introduces one of the core contributions of the book: a novel register file (foreground memory) organization that improves the energy efficiency of the register file. The organization is motivated across the application, architecture, and physical design abstraction layers, and the chapter explains at each of these layers why the proposed register file is more efficient. It is fully compatible with the energy-efficient scratchpad memory organization proposed in the previous chapter for large data storage. The chapter also shows that the proposed architecture is energy efficient across different DSP benchmarks.
Keywords: Register File, Memory Organization, Data Level Parallelism, Scratchpad Memory, Wide Memory