Abstract
Several studies have sought to reduce instruction cache misses and branch-penalty stall cycles through various forms of code placement. Most proposed approaches rearrange procedures or basic blocks to speed up execution on sequential architectures with branch prediction. Moreover, most of this work focuses on instruction cache performance and disregards execution cycles. To the best of our knowledge, no work has specifically addressed statically scheduled ILP machines, such as VLIWs, with control-transfer delay slots. We propose a new code positioning algorithm designed specifically for VLIW-style architectures, which allows a tighter schedule to be traded off against program locality. Our measurements indicate that code positioning, through a tighter program schedule and the removal of unconditional jumps, can reduce the number of execution cycles by up to 21% while also improving program locality and instruction cache performance.
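As a rough illustration of what rearranging basic blocks means in general, the sketch below uses greedy chain formation, a technique commonly used in profile-guided block placement. It is not the algorithm proposed in this paper: it models neither VLIW scheduling nor control-transfer delay slots. The function names, block names, and edge counts are all hypothetical.

```python
def place_blocks(blocks, edges):
    """Greedy chain formation over a weighted control-flow graph.

    blocks: block names in their original layout order (blocks[0] is the entry).
    edges:  {(src, dst): execution count}, assumed to come from a profile.
    Returns a new layout order in which hot edges tend to become fall-throughs.
    """
    entry = blocks[0]
    chain_of = {b: [b] for b in blocks}  # block -> the chain (list) containing it
    # Visit edges from hottest to coldest.
    for (src, dst), _count in sorted(edges.items(), key=lambda kv: -kv[1]):
        a, b = chain_of[src], chain_of[dst]
        # Merge only when src ends one chain and dst starts a different one.
        # The entry block must stay at the head of its chain.
        if a is not b and a[-1] == src and b[0] == dst and dst != entry:
            a.extend(b)
            for block in b:
                chain_of[block] = a
    # Emit the entry chain first, then the rest by first appearance.
    layout, seen = [], set()
    for block in blocks:
        chain = chain_of[block]
        if id(chain) not in seen:
            seen.add(id(chain))
            layout.extend(chain)
    return layout


def fallthrough_weight(layout, edges):
    """Sum the counts of edges whose target directly follows their source."""
    pos = {b: i for i, b in enumerate(layout)}
    return sum(w for (s, d), w in edges.items() if pos[d] == pos[s] + 1)


# Hypothetical diamond-shaped CFG with one hot and one cold path.
blocks = ["entry", "cold", "hot", "exit"]
edges = {
    ("entry", "cold"): 10,
    ("entry", "hot"): 90,
    ("cold", "exit"): 10,
    ("hot", "exit"): 90,
}
layout = place_blocks(blocks, edges)
print(layout)                             # ['entry', 'hot', 'exit', 'cold']
print(fallthrough_weight(blocks, edges))  # 100
print(fallthrough_weight(layout, edges))  # 180
```

In this toy case the hot path becomes straight-line code, so more of the executed control transfers are fall-throughs rather than taken jumps. How such a layout interacts with the instruction schedule on a VLIW is the question the paper addresses, and this sketch does not attempt to capture it.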
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cilio, A.G.M., Corporaal, H. (2001). Code Positioning for VLIW Architectures. In: Hertzberger, B., Hoekstra, A., Williams, R. (eds) High-Performance Computing and Networking. HPCN-Europe 2001. Lecture Notes in Computer Science, vol 2110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48228-8_34
DOI: https://doi.org/10.1007/3-540-48228-8_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42293-8
Online ISBN: 978-3-540-48228-4
eBook Packages: Springer Book Archive