Design Automation for Embedded Systems, Volume 14, Issue 3, pp 265–284

Squashing code size in microcoded IPs while delivering high decompression speed

  • Chengmo Yang
  • Mingjing Chen
  • Alex Orailoglu
Open Access


Microcoded customized IPs offer superior performance and direct programmability of micro-architectural structures compared to instruction-based processors, yet at the cost of drastically enlarged code sizes. Code compression can reduce code size, but it must be designed with performance in mind so that the performance benefits of microcoded IPs are not squandered in the process. To attain this goal, we propose in this paper a fast code compression technique that exploits the sizable number of unspecified bits that microcodes contain. Although the values and the positions of the specified bits are highly irregular, the proposed technique can still fill in these specified bits flexibly and precisely by using a linear network. The linear property inherent in the compression strategy in turn enables the development of an extremely low-overhead decompression engine. At runtime, a fixed-bandwidth XOR network generates the decompressed code so that every specified bit takes its required value. Combining the proposed flexible XOR-based network with a minimal two-level storage for highly specified fields, such as immediate values, delivers high code compression at negligible performance and hardware overhead.
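The idea of filling specified bits through a linear XOR network can be illustrated with a small Python sketch. This is not the authors' implementation; it is a generic illustration under stated assumptions: the network is a fixed table of bitmasks (one per output bit), each output bit is the XOR of the compressed bits its mask selects, and compression solves the resulting linear system over GF(2) for the specified positions only. The names `NETWORK`, `compress`, and `decompress`, and the 4-input, 8-output sizes, are hypothetical.

```python
# Hypothetical fixed XOR network: NETWORK[i] is a bitmask of the compressed
# bits that are XORed together to produce output bit i (assumed values).
N_IN = 4
NETWORK = [0b0001, 0b0010, 0b0100, 0b1000, 0b0011, 0b0110, 0b1100, 0b1111]


def decompress(network, word):
    """Expand a compressed word: output bit i is the parity of (row i AND word)."""
    return [bin(row & word).count("1") & 1 for row in network]


def compress(network, n_in, specified):
    """Find an n_in-bit word whose expansion matches every specified bit.

    `specified` maps output position -> required bit value; all other
    positions are unspecified (don't-care). Returns None when the
    constraints cannot be satisfied by this network.
    """
    # Augmented rows: coefficients in bits 0..n_in-1, required value at bit n_in.
    rows = [network[i] | (v << n_in) for i, v in specified.items()]
    pivots = []  # (pivot column, fully reduced row)
    for col in range(n_in):
        idx = next((j for j, r in enumerate(rows) if (r >> col) & 1), None)
        if idx is None:
            continue  # free variable; it is left at 0
        p = rows.pop(idx)
        # Gauss-Jordan step over GF(2): clear this column everywhere else.
        rows = [r ^ p if (r >> col) & 1 else r for r in rows]
        pivots = [(c, q ^ p if (q >> col) & 1 else q) for c, q in pivots]
        pivots.append((col, p))
    if any(r >> n_in for r in rows):
        return None  # a leftover row reads 0 = 1: inconsistent constraints
    word = 0
    for col, p in pivots:
        if (p >> n_in) & 1:
            word |= 1 << col
    return word


# Usage: only three of the eight output bits are specified.
wanted = {0: 1, 4: 0, 6: 1}
word = compress(NETWORK, N_IN, wanted)
expanded = decompress(NETWORK, word)
assert all(expanded[i] == v for i, v in wanted.items())
```

In this sketch the decompressor is nothing more than a set of parity computations, which mirrors why a linear scheme permits a very cheap hardware decoder. When the specified bits of a word are linearly inconsistent for the chosen network, `compress` returns `None`, and a real design would need some fallback for such words.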


  • Microcode compression
  • Linear compression network
  • Microcoded processors



Copyright information

© The Author(s) 2010

Authors and Affiliations

  1. Electrical and Computer Engineering Department, University of Delaware, Newark, USA
  2. Computer Science and Engineering Department, University of California, La Jolla, San Diego, USA
