Microcode Compression Using Structured-Constrained Clustering

International Journal of Parallel Programming

Abstract

Modern microprocessors use microcode to implement legacy (rarely used) instructions, add new ISA features, and enable patches to an existing design. As more features are added to processors (e.g. protection and virtualization), the area and power costs associated with the microcode memory have increased significantly. A recent Intel internal design targeted at low power and small footprint estimated that the microcode ROM approaches 20% of the total die area (with the associated power consumption). Moreover, with the adoption of multicore architectures, the impact of microcode memory size on chip area has become relevant, forcing industry to revisit the microcode size problem. One solution is to store the microcode in compressed form and decompress it at runtime. This paper describes techniques for microcode compression that achieve significant area and power savings, and proposes a streamlined architecture that enables high throughput within the constraints of a high-performance CPU. The paper presents results for microcode compression on several commercial CPU designs, which demonstrate compression ratios ranging from 50 to 62%. In addition, it proposes techniques that enable the reuse of (pre-validated) hardware building blocks, which can considerably reduce the cost and design time of the microcode decompression engine in real-world designs.
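As a rough illustration of storing microcode in compressed form and decompressing it at runtime, the sketch below applies a generic column-partitioned dictionary scheme to a toy microcode ROM. It is not the paper's algorithm: the abstract does not specify the clustering procedure, so the column grouping here is fixed by hand, and the names `compress`, `decompress`, `compressed_bits`, the 8-bit toy words, and the ratio definition (compressed size divided by original size) are all assumptions made for this example.

```python
def compress(words, groups):
    """Replace each column group of every word by a pointer into a per-group dictionary.

    words  -- list of microinstructions as integers
    groups -- list of lists of bit positions (a hand-picked partition, hypothetical)
    """
    dicts = [[] for _ in groups]   # unique bit patterns per group
    index = [{} for _ in groups]   # pattern -> position in that group's dictionary
    rows = []                      # one list of pointers per microinstruction
    for w in words:
        row = []
        for g, cols in enumerate(groups):
            pattern = tuple((w >> c) & 1 for c in cols)
            if pattern not in index[g]:
                index[g][pattern] = len(dicts[g])
                dicts[g].append(pattern)
            row.append(index[g][pattern])
        rows.append(row)
    return dicts, rows


def decompress(dicts, rows, groups):
    """Rebuild the original words by looking up each pointer in its dictionary."""
    words = []
    for row in rows:
        w = 0
        for g, cols in enumerate(groups):
            for c, bit in zip(cols, dicts[g][row[g]]):
                w |= bit << c
        words.append(w)
    return words


def compressed_bits(dicts, rows, groups):
    """Total storage: pointer arrays plus dictionaries, in bits."""
    total = 0
    for g, cols in enumerate(groups):
        ptr_width = max(1, (len(dicts[g]) - 1).bit_length())
        total += ptr_width * len(rows) + len(dicts[g]) * len(cols)
    return total


# Toy 8-bit microcode, split into low and high nibbles (illustrative data only).
toy_words = [0x12, 0x12, 0x32, 0x15, 0x35, 0x12, 0x32, 0x15]
toy_groups = [[0, 1, 2, 3], [4, 5, 6, 7]]
toy_dicts, toy_rows = compress(toy_words, toy_groups)
ratio = compressed_bits(toy_dicts, toy_rows, toy_groups) / (8 * len(toy_words))
print(f"compressed/original = {ratio:.2f}")
```

In a scheme of this kind, the choice of column groups determines how many unique patterns each dictionary must hold, which is why the grouping is the quantity a compression algorithm would search over rather than fix by hand as done here.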

Author information

Corresponding author

Correspondence to Edson Borin.

About this article

Cite this article

Borin, E., Araujo, G., Breternitz, M. et al. Microcode Compression Using Structured-Constrained Clustering. Int J Parallel Prog 42, 140–164 (2014). https://doi.org/10.1007/s10766-012-0206-9

  • DOI: https://doi.org/10.1007/s10766-012-0206-9
