
International Journal of Parallel Programming, Volume 36, Issue 2, pp 267–286

Modulo Path History for the Reduction of Pipeline Overheads in Path-based Neural Branch Predictors

  • Gabriel H. Loh
  • Daniel A. Jiménez

Abstract

Neural-inspired branch predictors achieve very low branch misprediction rates. However, previously proposed implementations have a variety of characteristics that make them challenging to implement in future high-performance processors. In particular, the path-based neural predictor (PBNP) and the piecewise-linear (PWL) predictor require deep pipelining and additional area to support checkpointing for misprediction recovery. The complexity of the PBNP predictor stems from the fact that the path-history length, which determines the number of tables and pipeline stages, is equal to the outcome-history length, which is typically very long for high accuracy. We propose to decouple the path-history length from the outcome-history length through a new technique called modulo-path history. By allowing a shorter path history, we can implement the PBNP and PWL predictors with significantly fewer tables and pipeline stages while still exploiting a traditional long branch outcome history.
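
The decoupling described in the abstract can be illustrated with a toy, non-pipelined sketch. It is not the authors' implementation: the abstract does not specify the indexing scheme, so the sketch assumes that the weight for outcome-history position i lives in table i mod P and is selected by the address of the (i mod P)-th most recent branch. The class name, table sizes, and training threshold are all hypothetical choices made for illustration.

```python
from collections import deque


class ModuloPathPerceptron:
    """Toy perceptron predictor: long outcome history, short path history.

    Assumption (not from the abstract): the weight for outcome position i
    is stored in table (i % path_len) and selected by the address of the
    (i % path_len)-th most recent branch.
    """

    def __init__(self, outcome_len=8, path_len=2, rows=64, theta=20):
        assert outcome_len % path_len == 0
        self.p = path_len
        self.rows = rows
        self.theta = theta  # hypothetical training threshold
        self.bias = [0] * rows
        per_row = outcome_len // path_len
        # Only path_len tables; each row holds outcome_len / path_len weights.
        self.tables = [[[0] * per_row for _ in range(rows)]
                       for _ in range(path_len)]
        # Index 0 is the most recent entry; +1 = taken, -1 = not taken.
        self.outcomes = deque([-1] * outcome_len, maxlen=outcome_len)
        self.path = deque([0] * path_len, maxlen=path_len)

    def _slot(self, i):
        # Map outcome position i to (weight row, column) via i mod path_len.
        j = i % self.p
        row = self.path[j] % self.rows
        return self.tables[j][row], i // self.p

    def output(self, pc):
        # Dot product of the selected weights with the outcome history.
        y = self.bias[pc % self.rows]
        for i, x in enumerate(self.outcomes):
            weights, k = self._slot(i)
            y += weights[k] * x
        return y

    def predict(self, pc):
        return self.output(pc) >= 0

    def update(self, pc, taken):
        y = self.output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is below the threshold.
        if (y >= 0) != taken or abs(y) <= self.theta:
            self.bias[pc % self.rows] += t
            for i, x in enumerate(self.outcomes):
                weights, k = self._slot(i)
                weights[k] += t * x
        # Shift both histories; the oldest entries fall off the right end.
        self.outcomes.appendleft(t)
        self.path.appendleft(pc)
```

With `outcome_len=8` and `path_len=2`, the sketch uses two weight tables instead of eight, while the dot product still covers all eight outcome bits. In a pipelined design the number of tables would correspond to the number of stages, which is the overhead the abstract aims to reduce.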

Keywords

Computer architecture, Branch prediction



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. College of Computing, Georgia Institute of Technology, Atlanta, USA
  2. Department of Computer Science, Rutgers University, Piscataway, USA
