Supporting Speculative Multithreading on Simultaneous Multithreaded Processors

  • Venkatesan Packirisamy
  • Shengyue Wang
  • Antonia Zhai
  • Wei-Chung Hsu
  • Pen-Chung Yew
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4297)


Speculative multithreading is a technique for improving single-thread performance. Speculative multithreading architectures for chip multiprocessors (CMPs) have been studied extensively, but relatively few studies address the design of speculative multithreading for simultaneous multithreading (SMT) processors. The current SMT-based designs, IMT [9] and DMT [2], use the load/store queue (LSQ) to perform dependence checking. Because the size of the LSQ is limited, this design is suitable only for small threads. In this paper we present a novel cache-based architecture that supports speculative simultaneous multithreading and can efficiently handle larger threads. In our architecture, the associativity of the cache is used to buffer speculative values. Our 4-thread architecture achieves about 15% speedup over an equivalent superscalar processor and, on average, about 3% speedup over the LSQ-based architectures, while requiring less complex hardware. For larger threads, our scheme performs 14% better than the LSQ-based scheme.
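The idea of buffering speculative values in the ways of a set-associative cache can be illustrated with a toy model. The sketch below is not the paper's design: the class names (`Way`, `SpecCacheSet`, `SetOverflow`), the use of thread ids as program order, and the violation rule are all assumptions made for illustration. It models a single cache set in which each way may hold one thread's version of a line, records speculative loads, and reports a dependence violation when a logically earlier thread stores to a line that a later thread has already read.

```python
from dataclasses import dataclass, field


class SetOverflow(Exception):
    """Raised when no way is free to buffer another version (hypothetical)."""


@dataclass
class Way:
    tag: int
    owner: int   # writing thread id; -1 marks the committed (non-speculative) copy
    value: int
    readers: set = field(default_factory=set)  # threads that speculatively loaded it


class SpecCacheSet:
    """Toy model of one cache set. Assumption: a lower thread id means a
    logically earlier (less speculative) thread."""

    def __init__(self, num_ways=4):
        self.num_ways = num_ways
        self.ways = []

    def _allocate(self, tag, owner, value):
        # A full set cannot buffer more versions; real hardware would stall.
        if len(self.ways) >= self.num_ways:
            raise SetOverflow(f"no free way for tag {tag:#x}")
        way = Way(tag, owner, value)
        self.ways.append(way)
        return way

    def load(self, thread, tag, memory_value=0):
        # Read the newest version visible to this thread and record the load.
        visible = [w for w in self.ways if w.tag == tag and w.owner <= thread]
        if visible:
            way = max(visible, key=lambda w: w.owner)
        else:
            way = self._allocate(tag, -1, memory_value)
        way.readers.add(thread)
        return way.value

    def store(self, thread, tag, value):
        # Later threads that already read an older version saw a stale value.
        violators = {
            r
            for w in self.ways
            if w.tag == tag and w.owner <= thread
            for r in w.readers
            if r > thread
        }
        own = next((w for w in self.ways if w.tag == tag and w.owner == thread), None)
        if own is None:
            self._allocate(tag, thread, value)
        else:
            own.value = value
        return sorted(violators)
```

In this model the number of ways bounds how many versions of a line can coexist, which is why a full set raises `SetOverflow` instead of silently evicting speculative state. A real design would also need commit and squash handling, which the sketch omits.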


Cache Line, Speculative Load, Chip Multiprocessor, Speculative Thread, Dependence Violation
These keywords were added by machine and not by the authors.




  1. Open research compiler for Itanium processor family
  2. Akkary, H., Driscoll, M.: A dynamic multithreading processor. In: MICRO-31 (December 1998)
  3. Colohan, C.B., Ailamaki, A., Steffan, J.G., Mowry, T.C.: Tolerating dependences between large speculative threads via sub-threads. In: Proceedings of the 33rd ISCA, Boston, MA (2006)
  4. Franklin, M., Sohi, G.S.: ARB: A hardware mechanism for dynamic reordering of memory references. IEEE Transactions on Computers 45(5) (May 1996)
  5. Krishnan, V., Torrellas, J.: A chip multiprocessor architecture with speculative multithreading. IEEE Transactions on Computers, Special Issue on Multithreaded Architecture (September 1999)
  6. Luk, C.-K., Cohn, R., Muth, R., Patil, H., Klauser, A., Lowney, G., Wallace, S., Reddi, V.J., Hazelwood, K.: Pin: building customized program analysis tools with dynamic instrumentation. In: Proc. ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (2005)
  7. Marcuello, P., Gonzalez, A.: Exploiting speculative thread-level parallelism on a SMT processor. In: Proceedings of the 7th International Conference on High-Performance Computing and Networking (April 1999)
  8. Martínez, J.F., Renau, J., Huang, M.C., Prvulovic, M., Torrellas, J.: Cherry: checkpointed early resource recycling in out-of-order microprocessors. In: Proceedings of MICRO-35, Istanbul, Turkey (2002)
  9. Park, I., Falsafi, B., Vijaykumar, T.N.: Implicitly-multithreaded processors. In: Proceedings of the 30th ISCA (June 2003)
  10. Steffan, J.G., Colohan, C.B., Zhai, A., Mowry, T.C.: A scalable approach to thread-level speculation. In: Proceedings of the 27th ISCA (June 2000)
  11. Steffan, J.G., Colohan, C.B., Zhai, A., Mowry, T.C.: The STAMPede approach to thread-level speculation. In: ACM Transactions on Computer Systems (TOCS), vol. 23 (August 2005)
  12. Tullsen, D., Eggers, S., Emer, J., Levy, H., Lo, J., Stamm, R.: Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor. In: Proceedings of the 23rd ISCA (May 1996)
  13. Vijaykumar, T.N., Gopal, S., Smith, J.E., Sohi, G.: Speculative versioning cache. IEEE Transactions on Parallel and Distributed Systems 12, 1305–1317 (2001)
  14. Wang, S., Dai, X., Yellajyosula, K.S., Zhai, A.: Loop selection for thread-level speculation. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds.) LCPC 2005. LNCS, vol. 4339, pp. 289–303. Springer, Heidelberg (2006)
  15. Zhai, A., Colohan, C.B., Steffan, J.G., Mowry, T.C.: Compiler optimization of scalar value communication between speculative threads. In: Proceedings of the 10th ASPLOS (October 2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Venkatesan Packirisamy (1)
  • Shengyue Wang (1)
  • Antonia Zhai (1)
  • Wei-Chung Hsu (1)
  • Pen-Chung Yew (1)

  1. Department of Computer Science, University of Minnesota, Minneapolis
