Abstract
Energy-efficient dynamic branch predictors are proposed for the Cell SPE, which normally depends on compiler-inserted hint instructions to predict branches. All designed schemes use a Branch Target Buffer (BTB) to store the branch target address and the prediction, which is computed using a bimodal counter. One prediction scheme pre-decodes instructions when they are fetched from the local store and accesses the BTB only for branch instructions, thereby saving power compared to conventional dynamic predictors that access the BTB for every instruction. In addition, several ways to leverage the existing hint instructions for the dynamic branch predictor are studied. We also introduce branch warning instructions which initiate branch prediction before the actual branch instruction is fetched. They allow fetching the instructions starting at the branch target and thus completely remove the branch penalty for correctly predicted branches. For a 256-entry BTB, a speedup of up to 18.8% is achieved. The power consumption of the branch prediction schemes is estimated at 1% or less of the total power dissipation of the SPE and the average energy-delay product is reduced by up to 6.2%.
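The mechanism summarized above, a BTB that stores the target and a bimodal prediction and is consulted only for pre-decoded branches, can be illustrated with a minimal sketch. This is not the authors' design: the direct-mapped organization, the full-address tag, the 2-bit counter width, the allocate-on-taken policy, and all names (`BimodalBTB`, `fetch`, `accesses`) are assumptions made for illustration. Only the 256-entry size and the idea of skipping the BTB for non-branch instructions come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional

BTB_ENTRIES = 256  # BTB size quoted in the abstract


@dataclass
class BTBEntry:
    tag: int          # assumption: the full branch address serves as the tag
    target: int       # stored branch target address
    counter: int = 2  # assumption: 2-bit saturating counter, starts weakly taken


class BimodalBTB:
    """Hypothetical direct-mapped BTB with one bimodal counter per entry."""

    def __init__(self, entries: int = BTB_ENTRIES) -> None:
        self.entries = entries
        self.table: List[Optional[BTBEntry]] = [None] * entries
        self.accesses = 0  # lookup count, a rough proxy for BTB access energy

    def _index(self, pc: int) -> int:
        # Assumption: 4-byte instructions, indexed by low-order word-address bits.
        return (pc >> 2) % self.entries

    def predict(self, pc: int) -> Optional[int]:
        """Return the predicted target, or None for a miss or a not-taken prediction."""
        self.accesses += 1
        entry = self.table[self._index(pc)]
        if entry is not None and entry.tag == pc and entry.counter >= 2:
            return entry.target
        return None

    def update(self, pc: int, taken: bool, target: int) -> None:
        """Train the entry with the resolved branch outcome."""
        index = self._index(pc)
        entry = self.table[index]
        if entry is None or entry.tag != pc:
            if taken:  # assumption: allocate an entry only for taken branches
                self.table[index] = BTBEntry(tag=pc, target=target)
            return
        if taken:
            entry.counter = min(3, entry.counter + 1)
            entry.target = target
        else:
            entry.counter = max(0, entry.counter - 1)


def fetch(btb: BimodalBTB, pc: int, is_branch: bool) -> Optional[int]:
    """Pre-decode gating: consult the BTB only when the instruction is a branch."""
    if not is_branch:
        return None  # no BTB lookup, so no access is counted
    return btb.predict(pc)
```

In this sketch, calling `fetch` with `is_branch=False` leaves `accesses` unchanged, which is the source of the power saving relative to a predictor that looks up the BTB for every fetched instruction.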
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Briejer, M., Meenderinck, C., Juurlink, B. (2010). Extending the Cell SPE with Energy Efficient Branch Prediction. In: D’Ambra, P., Guarracino, M., Talia, D. (eds) Euro-Par 2010 - Parallel Processing. Euro-Par 2010. Lecture Notes in Computer Science, vol 6271. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15277-1_29
DOI: https://doi.org/10.1007/978-3-642-15277-1_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15276-4
Online ISBN: 978-3-642-15277-1
eBook Packages: Computer Science (R0)