Design Alternatives of Multithreaded Architecture

Mendelson, Avi; Bekerman, Michael

doi:10.1023/A:1018733528538

Published: June 1999

Volume 27, pages 161–193 (1999)

International Journal of Parallel Programming

Abstract

This paper compares two possible implementations of multithreaded architecture and proposes a new architecture that combines the flexibility of the first with the low hardware complexity of the second. We present a performance analysis and a step-by-step complexity analysis of two design alternatives for multithreaded architecture: dynamic inter-thread resource scheduling and static resource allocation. We then introduce a new multithreaded architecture based on a new scheduling mechanism called “semi-static” scheduling. We show that with two concurrent threads the dynamic scheduling processor achieves 5% to 45% higher performance, at the cost of a much more complicated design. This paper indicates that for a relatively high number of execution resources the complexity of the dynamic scheduling logic will inevitably require design compromises. Moreover, high chip-wide communication time and an incomplete bypassing network will limit dynamic scheduling and reduce its performance advantage. On the other hand, the static scheduling architecture achieves low resource utilization. The semi-static architecture uses compiler techniques to exploit patterns of program parallelism and introduces a new hardware mechanism in order to achieve performance close to that of dynamic scheduling without significantly increasing the hardware complexity of the static design. The semi-static architecture statically assigns some of the functional units but dynamically schedules the most performance-critical functional units on a medium-grain basis.
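
The contrast between the three allocation policies can be illustrated with a toy model. The sketch below is not the authors' simulator or their hardware mechanism: the per-cycle demand traces, the function names, the split into private and shared units, and the history-based rule for reassigning the shared units every `interval` cycles are all assumptions made for illustration. It only counts how many operations two threads could issue per cycle under each policy.

```python
from typing import Sequence


def static_issue(demand_a: Sequence[int], demand_b: Sequence[int], units: int) -> int:
    """Static allocation: each thread owns a fixed half of the units."""
    half = units // 2
    return sum(min(a, half) + min(b, units - half)
               for a, b in zip(demand_a, demand_b))


def dynamic_issue(demand_a: Sequence[int], demand_b: Sequence[int], units: int) -> int:
    """Dynamic scheduling: all units are shared on every cycle."""
    return sum(min(a + b, units) for a, b in zip(demand_a, demand_b))


def semi_static_issue(demand_a: Sequence[int], demand_b: Sequence[int],
                      units: int, shared: int, interval: int) -> int:
    """Hypothetical semi-static policy (an assumption, not the paper's design):
    each thread keeps some private units, and a pool of `shared` units is
    handed to one thread for a whole interval of cycles at a time."""
    if (units - shared) % 2 != 0:
        raise ValueError("private units must split evenly between two threads")
    private = (units - shared) // 2
    owner = 0              # thread currently holding the shared pool
    window = [0, 0]        # demand observed during the current interval
    issued = 0
    for cycle, (a, b) in enumerate(zip(demand_a, demand_b)):
        if cycle > 0 and cycle % interval == 0:
            # Medium-grain decision: give the pool to the thread that
            # demanded more units during the previous interval.
            owner = 0 if window[0] >= window[1] else 1
            window = [0, 0]
        cap_a = private + (shared if owner == 0 else 0)
        cap_b = private + (shared if owner == 1 else 0)
        issued += min(a, cap_a) + min(b, cap_b)
        window[0] += a
        window[1] += b
    return issued


# Made-up traces: thread A is highly parallel first, then thread B.
trace_a = [6, 6, 6, 6, 1, 1, 1, 1]
trace_b = [1, 1, 1, 1, 6, 6, 6, 6]

print(static_issue(trace_a, trace_b, units=8))                            # 40
print(semi_static_issue(trace_a, trace_b, units=8, shared=4, interval=2)) # 48
print(dynamic_issue(trace_a, trace_b, units=8))                           # 56
```

On these made-up traces the coarse-grained policy lands between the static and fully dynamic ones, which matches the qualitative ordering the abstract describes; the specific numbers carry no meaning beyond the toy model.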

Cite this article

Mendelson, A., Bekerman, M. Design Alternatives of Multithreaded Architecture. International Journal of Parallel Programming 27, 161–193 (1999). https://doi.org/10.1023/A:1018733528538
