Abstract
Modern microprocessors use microcode to implement legacy (rarely used) instructions, add new ISA features, and enable patches to an existing design. As more features are added to processors (e.g. protection and virtualization), the area and power costs associated with the microcode memory have increased significantly. A recent Intel internal design targeted at low power and small footprint estimated that the cost of the microcode ROM approaches 20% of the total die area (with the associated power consumption). Moreover, with the adoption of multicore architectures, the impact of microcode memory size on chip area has become relevant, forcing industry to revisit the microcode size problem. One solution is to store the microcode in a compressed form and decompress it at runtime. This paper describes techniques for microcode compression that achieve significant area and power savings, and proposes a streamlined architecture that enables high throughput within the constraints of a high-performance CPU. It presents results for microcode compression on several commercial CPU designs that demonstrate compression ratios ranging from 50 to 62%. In addition, it proposes techniques that enable the reuse of (pre-validated) hardware building blocks, which can considerably reduce the cost and design time of the microcode decompression engine in real-world designs.
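To make the idea of storing microcode compressed and decompressing it at runtime concrete, here is a minimal Python sketch of a generic field-wise dictionary scheme. It is an illustration under stated assumptions, not the clustering method of the paper: the fixed field split, the function names (`compress`, `decompress`, `compressed_bits`), the toy ROM contents, and the convention that the compression ratio is compressed size divided by original size are all assumptions introduced here.

```python
def index_bits(n_entries):
    """Bits needed to address a dictionary with n_entries entries (at least 1)."""
    return max(1, (n_entries - 1).bit_length())

def compress(words, width, field_widths):
    """Split each word into fixed fields and dictionary-encode every field.

    Hypothetical scheme: one dictionary per field, one index vector per word.
    """
    assert sum(field_widths) == width
    dictionaries = [[] for _ in field_widths]   # unique values per field
    lookups = [{} for _ in field_widths]        # value -> dictionary index
    vectors = []
    for word in words:
        shift = width
        indices = []
        for k, fw in enumerate(field_widths):
            shift -= fw
            value = (word >> shift) & ((1 << fw) - 1)
            if value not in lookups[k]:
                lookups[k][value] = len(dictionaries[k])
                dictionaries[k].append(value)
            indices.append(lookups[k][value])
        vectors.append(tuple(indices))
    return vectors, dictionaries

def decompress(vectors, dictionaries, field_widths):
    """Rebuild the original words by looking up each field in its dictionary."""
    words = []
    for indices in vectors:
        word = 0
        for k, fw in enumerate(field_widths):
            word = (word << fw) | dictionaries[k][indices[k]]
        words.append(word)
    return words

def compressed_bits(vectors, dictionaries, field_widths):
    """Total storage: index vectors plus the dictionaries themselves."""
    pointer_bits = len(vectors) * sum(index_bits(len(d)) for d in dictionaries)
    dict_bits = sum(len(d) * fw for d, fw in zip(dictionaries, field_widths))
    return pointer_bits + dict_bits

# Toy ROM (made-up data): 32 words of 16 bits, split into two 8-bit fields.
WIDTH, FIELDS = 16, [8, 8]
rom = [0xA1B2, 0xA1C3, 0xD4B2, 0xD4C3] * 8
vectors, dictionaries = compress(rom, WIDTH, FIELDS)
original = len(rom) * WIDTH
ratio = compressed_bits(vectors, dictionaries, FIELDS) / original
```

The toy ROM is deliberately repetitive, so its ratio says nothing about real microcode. The sketch only shows the trade-off such schemes face: narrower index vectors save space per word, but every dictionary entry adds storage and a lookup on the decompression path.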
Cite this article
Borin, E., Araujo, G., Breternitz, M. et al. Microcode Compression Using Structured-Constrained Clustering. Int J Parallel Prog 42, 140–164 (2014). https://doi.org/10.1007/s10766-012-0206-9
DOI: https://doi.org/10.1007/s10766-012-0206-9