Branch Optimisation Techniques for Hardware Compilation

  • Henry Styles
  • Wayne Luk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2778)


This paper explores using information about program branch probabilities to optimise reconfigurable designs. The basic premise is to promote utilisation by dedicating more resources to branches which execute more frequently. A hardware compilation system has been developed for producing designs which are optimised for different branch probabilities. We propose an analytical queueing network performance model to determine the best design from observed branch probability information. The branch optimisation space is characterised in an experimental study, targeting Xilinx Virtex FPGAs, of two complex applications: video feature extraction and progressive refinement radiosity. For designs of equal performance, branch-optimised designs require 24% and 27.5% less area. For designs of equal area, branch-optimised designs run up to 3 times faster. Our analytical performance model is shown to be highly accurate, with relative error between 0.12 and 1.1 × 10⁻⁴.
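
The premise of dedicating more resources to frequently executed branches can be illustrated with a small sketch. This is not the queueing network model proposed in the paper; it is a simplified bottleneck model under stated assumptions: each branch receives a fixed, strictly positive fraction of the work, every hardware unit processes one item per cycle, and the names throughput and allocate are hypothetical.

```python
from fractions import Fraction

def throughput(branch_probs, units):
    """Items per cycle sustained by the whole design.

    Assumption (not from the paper): branch i receives a fraction
    branch_probs[i] > 0 of all items and owns units[i] hardware units,
    each able to process one item per cycle. The slowest branch limits
    the overall rate.
    """
    return min(Fraction(u) / p for p, u in zip(branch_probs, units))

def allocate(branch_probs, total_units):
    """Greedy allocation: start with one unit per branch, then keep
    adding a unit to whichever branch is currently the bottleneck."""
    units = [1] * len(branch_probs)
    for _ in range(total_units - len(branch_probs)):
        # The bottleneck is the branch with the lowest capacity-to-load ratio.
        worst = min(range(len(units)),
                    key=lambda i: Fraction(units[i]) / branch_probs[i])
        units[worst] += 1
    return units

# Hypothetical profile: one branch is taken 75% of the time.
probs = [Fraction(3, 4), Fraction(1, 4)]
uniform = [2, 2]                 # equal split of four units
optimised = allocate(probs, 4)   # more units go to the frequent branch

print(optimised, throughput(probs, optimised), throughput(probs, uniform))
```

With this hypothetical profile, the greedy allocation gives three units to the frequent branch and one to the rare branch, and the bottleneck rate rises from 8/3 to 4 items per cycle for the same total number of units. The figures come from this toy model only and say nothing about the paper's measured results.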





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Henry Styles (1)
  • Wayne Luk (1)
  1. Department of Computing, Imperial College, London, England
