Code Positioning for VLIW Architectures

  • Andrea G. M. Cilio
  • Henk Corporaal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2110)


Several studies have considered reducing instruction cache misses and branch penalty stall cycles by means of various forms of code placement. Most proposed approaches rearrange procedures or basic blocks in order to speed up execution on sequential architectures with branch prediction. Moreover, most work focuses mainly on instruction cache performance and disregards execution cycles. To the best of our knowledge, no work has specifically addressed statically scheduled ILP machines with control-transfer delay slots, such as VLIWs. We propose a new code positioning algorithm designed specifically for VLIW-style architectures, which makes it possible to trade a tighter schedule against program locality. Our measurements indicate that code positioning, as a result of a tighter program schedule and the removal of unconditional jumps, can significantly reduce the number of execution cycles, by up to 21%, while improving program locality and instruction cache performance.
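To make the idea of basic-block positioning concrete, the following is a minimal sketch of generic greedy, profile-guided block chaining. It is not the algorithm proposed in the paper: it ignores control-transfer delay slots and scheduling entirely, and the function names, the block names, and the edge counts are assumptions chosen for illustration. It only shows how placing a frequently taken successor directly after its predecessor turns taken jumps into fall-throughs.

```python
def chain_blocks(blocks, edges):
    """Greedy chaining (illustrative only): visit control-flow edges from
    hottest to coldest and join two chains when the edge source is the tail
    of one chain and the edge target is the head of another."""
    chain_of = {b: [b] for b in blocks}  # block -> the chain (list) holding it
    for (src, dst), _count in sorted(edges.items(), key=lambda kv: -kv[1]):
        a, b = chain_of[src], chain_of[dst]
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a
    # Emit chains in the order their first block appears in the input,
    # so the chain containing the entry block comes first.
    layout, seen = [], set()
    for blk in blocks:
        chain = chain_of[blk]
        if id(chain) not in seen:
            seen.add(id(chain))
            layout.extend(chain)
    return layout


def fall_through_count(layout, edges):
    """Total execution count of edges whose target directly follows its source."""
    nxt = dict(zip(layout, layout[1:]))
    return sum(c for (s, d), c in edges.items() if nxt.get(s) == d)


# Hypothetical profile: a diamond with one hot path and one cold path.
blocks = ["entry", "cold", "hot", "exit"]
edges = {
    ("entry", "hot"): 90,
    ("entry", "cold"): 10,
    ("hot", "exit"): 90,
    ("cold", "exit"): 10,
}
layout = chain_blocks(blocks, edges)
```

In this made-up profile, the original block order lets control fall through on edges executed 100 times in total, while the chained order raises that to 180 by keeping the hot path contiguous. A VLIW-oriented algorithm would additionally have to weigh the effect of each placement on the schedule, which this sketch does not attempt.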


Keywords: Basic Block, Cache Size, Cache Line, Instruction Cache, Code Position
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Andrea G. M. Cilio
  • Henk Corporaal
  1. Computer Architecture and Digital Techniques Dept., Delft University of Technology, Delft, The Netherlands
