An Asymmetrical Register File: The VWR

  • Francky Catthoor
  • Praveen Raghavan
  • Andy Lambrechts
  • Murali Jayapala
  • Angeliki Kritikakou
  • Javed Absar


This chapter introduces one of the core contributions of the book: a novel register file (foreground memory) organization that improves the energy efficiency of the register file. The organization is motivated across the application, architecture and physical design abstraction layers, and the chapter explains at each of these layers why the proposed register file is more efficient. It is fully compatible with the energy-efficient scratchpad memory organization proposed in the previous chapter for large data storage. The chapter also shows that the proposed architecture is energy efficient across different DSP benchmarks.


Register File; Memory Organization; Data Level Parallelism; Scratchpad Memory; Wide Memory
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  • Francky Catthoor (1)
  • Praveen Raghavan (1)
  • Andy Lambrechts (1)
  • Murali Jayapala (1)
  • Angeliki Kritikakou (2)
  • Javed Absar (3)

  1. Interuniversity MicroElectronics Center (IMEC), Leuven, Belgium
  2. VLSI Design Lab, Univ. Patras, Patras, Greece
  3. Samsung India Software Operations Pvt. Ltd, Bangalore, India
