Tolerating Branch Predictor Latency on SMT

Falcón, Ayose; Santana, Oliverio J.; Ramírez, Alex; Valero, Mateo

doi:10.1007/978-3-540-39707-6_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2858)

Included in the following conference series:

International Symposium on High Performance Computing

Abstract.

Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads: if one thread is stalled, its resources can be used by other threads. However, fetch stalls caused by multi-cycle branch predictors prevent SMT from achieving its full potential performance, since the flow of fetched instructions is halted.

This paper proposes and evaluates solutions to deal with the branch predictor delay on SMT. Our contribution is twofold: we describe a decoupled implementation of the SMT fetch unit, and we propose an inter-thread pipelined branch predictor implementation. These techniques prove to be effective for tolerating the branch predictor access latency.
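
The intuition behind pipelining predictor accesses across threads can be illustrated with a toy throughput model. This is only a sketch under stated assumptions, not the design evaluated in the paper, whose details the abstract does not give. It assumes that a thread cannot start its next prediction until the previous one completes, that the predictor accepts one new access per cycle, and that threads are selected round-robin. The function name `predictions_issued` and its parameters are hypothetical.

```python
def predictions_issued(num_threads, latency, cycles):
    """Count predictions started within `cycles` cycles (toy model).

    Assumptions (illustrative, not taken from the paper):
    - a thread's next prediction depends on its previous one, so a thread
      may start a new access only `latency` cycles after its last access;
    - the predictor is fully pipelined: one new access may start per cycle;
    - ready threads are chosen in round-robin order.
    """
    ready_at = [0] * num_threads  # earliest cycle each thread may start an access
    issued = 0
    next_thread = 0
    for cycle in range(cycles):
        # Scan threads in round-robin order and start at most one access.
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + latency
                issued += 1
                next_thread = (t + 1) % num_threads
                break
    return issued


if __name__ == "__main__":
    for threads in (1, 2, 3, 4):
        print(threads, predictions_issued(threads, latency=3, cycles=12))
```

In this model, a single thread with a 3-cycle predictor starts a prediction only every third cycle, whereas three or more threads keep the predictor busy every cycle. This is the sense in which accesses from other threads can hide the predictor latency.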

Author information

Authors and Affiliations

Departament d’Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain
Ayose Falcón, Oliverio J. Santana, Alex Ramírez & Mateo Valero

Editor information

Editors and Affiliations

Department of Computer Science, University of California (UCI), 3019 Donald Bren Hall, 92697-3435, Irvine, CA, USA
Alex Veidenbaum
Department of Information and Computer Science, Faculty of Science, Nara Women’s University, Kitauoyanishi-machi, Nara-city, 630-8506, Nara, Japan
Kazuki Joe
Keio University, Hiyoshi, Kohoku, Yokohama, 223-8522, Kanagawa, Japan
Hideharu Amano
Tokyo University of Technology, 1404-1 Katakura, Hachioji, 192-0982, Tokyo, Japan
Hideo Aiso

About this paper

Cite this paper

Falcón, A., Santana, O.J., Ramírez, A., Valero, M. (2003). Tolerating Branch Predictor Latency on SMT. In: Veidenbaum, A., Joe, K., Amano, H., Aiso, H. (eds) High Performance Computing. ISHPC 2003. Lecture Notes in Computer Science, vol 2858. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39707-6_7

DOI: https://doi.org/10.1007/978-3-540-39707-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20359-9
Online ISBN: 978-3-540-39707-6
eBook Packages: Springer Book Archive
