Bibliography
General DASH papers
Daniel Lenoski, Jim Laudon, Truman Joe, Dave Nakahira, Luis Stevens, Anoop Gupta, and John Hennessy. The DASH Prototype: Implementation and Performance. In Proceedings of 19th International Symposium on Computer Architecture. May, 1992. To appear.
Daniel Lenoski, James Laudon, Kourosh Gharachorloo, Wolf-Dietrich Weber, Anoop Gupta, John Hennessy, Mark Horowitz, and Monica Lam. The Stanford DASH Multiprocessor. IEEE Computer 25(3), March, 1992.
Daniel Lenoski. The Design and Analysis of DASH: A Scalable Directory-Based Multiprocessor. Technical Report CSL-TR-92-507, Computer Systems Laboratory, Stanford University, 1992.
DASH Coherence Protocol and Directory Structure
Anoop Gupta and Wolf-Dietrich Weber. Cache Invalidation Patterns in Shared-Memory Multiprocessors. 1992. To appear in IEEE Transactions on Computers.
Anoop Gupta, Wolf-Dietrich Weber, and Todd Mowry. Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes. In Proceedings of International Conference on Parallel Processing. August, 1990.
Daniel Lenoski, James Laudon, Kourosh Gharachorloo, Anoop Gupta and John Hennessy. The Directory-Based Cache Coherence Protocol for the DASH Multiprocessor. In Proceedings of 17th International Symposium on Computer Architecture. May, 1990.
Architecture tools
Helen Davis, Stephen Goldschmidt and John Hennessy. Multiprocessor Simulation and Tracing using Tango. In Proceedings of the 1991 International Conference on Parallel Processing. August, 1991.
Latency hiding and tolerating techniques
Anoop Gupta, John Hennessy, Kourosh Gharachorloo, Todd Mowry, and Wolf-Dietrich Weber. Comparative Evaluation of Latency Reducing and Tolerating Techniques. In Proceedings of 18th International Symposium on Computer Architecture. May, 1991.
Kourosh Gharachorloo, Anoop Gupta, John Hennessy. Hiding Memory Latency Using Dynamic Scheduling in Shared-Memory Multiprocessors. In Proceedings of 19th International Symposium on Computer Architecture. May, 1992. To appear.
Kourosh Gharachorloo, Anoop Gupta, and John Hennessy. Performance Evaluation of Memory Consistency Models for Shared-Memory Multiprocessors. In Proceedings of Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. April, 1991.
Kourosh Gharachorloo, Daniel Lenoski, James Laudon, Phillip Gibbons, Anoop Gupta and John Hennessy. Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors. In Proceedings of 17th International Symposium on Computer Architecture. May, 1990.
Todd Mowry and Anoop Gupta. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors. Journal of Parallel and Distributed Computing 12(2), June, 1991.
Wolf-Dietrich Weber and Anoop Gupta. Exploring the Benefits of Multiple Hardware Contexts in a Multiprocessor Architecture: Preliminary Results. In Proceedings of 16th International Symposium on Computer Architecture. June, 1989.
Other architectural studies
Per Stenstrom, Truman Joe, and Anoop Gupta. Comparative Performance Evaluation of Cache-Coherent NUMA and COMA Architectures. In Proceedings of 19th International Symposium on Computer Architecture. May, 1992. To appear.
Operating systems
Josep Torrellas, Anoop Gupta, and John Hennessy. Characterizing the Cache Performance and Synchronization Behavior of a Multiprocessor Operating System. Technical Report CSL-TR-92-502, Computer Systems Laboratory, Stanford University, January, 1992.
Anoop Gupta, Andrew Tucker, and Luis Stevens. Making Effective Use of Shared-Memory Multiprocessors: The Process Control Approach. Technical Report CSL-TR-91-475, Computer Systems Laboratory, Stanford University, May, 1991.
Anoop Gupta, Andrew Tucker, and Shigeru Urushibara. The Impact of Operating System Scheduling Policies and Synchronization Methods on the Performance of Parallel Applications. In Proceedings of ACM SIGMETRICS. May, 1991.
Andrew Tucker and Anoop Gupta. Process Control and Scheduling Issues for Multiprogrammed Shared-Memory Multiprocessors. In Proceedings of 12th ACM Symposium on Operating Systems Principles. December, 1989.
Programming languages
Rohit Chandra, Anoop Gupta and John Hennessy. Integrating Concurrency and Data Abstraction in a Parallel Programming Language. Technical Report CSL-TR-92-511, Computer Systems Laboratory, Stanford University, February, 1992.
Rohit Chandra, Anoop Gupta, and John Hennessy. COOL: A Language for Parallel Programming. Research Monographs in Parallel and Distributed Computing. Languages and Compilers for Parallel Computing. Edited by Chris Jesshope and David Klappholz, The MIT Press, 1990.
Monica Lam and Martin Rinard. Coarse-Grain Parallel Programming in Jade. In Proceedings of Third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. April, 1991.
Martin Rinard and Monica Lam. Semantic Foundations of Jade. In Proc. 19th Annual ACM Symposium on Principles of Programming Languages. Jan, 1992.
Compilers
Michael Wolf and Monica Lam. A Loop Transformation Theory and Algorithm to Maximize Parallelism. IEEE Trans. on Parallel and Distributed Systems, Oct, 1991.
Dror Maydan, John Hennessy, and Monica Lam. Efficient and Exact Data Dependence Analysis. In Proc. ACM SIGPLAN 91 Conference on Programming Language Design and Implementation. Jun, 1991.
Michael Wolf and Monica Lam. A Data Locality Optimizing Algorithm. In Proc. ACM SIGPLAN 91 Conference on Programming Language Design and Implementation. Jun, 1991.
Monica Lam, Edward Rothberg, and Michael Wolf. The Cache Performance and Optimizations of Blocked Algorithms. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS IV). Apr., 1991.
Performance tools
Margaret Martonosi, Anoop Gupta, and Tom Anderson. MemSpy: Analyzing Memory System Bottlenecks in Programs. In Proceedings of ACM SIGMETRICS. May, 1992. To appear.
Aaron Goldberg and John Hennessy. Performance Debugging Shared Memory Multiprocessor Programs with MTOOL. In Proceedings of Supercomputing '91. November, 1991.
Applications
Jaswinder Pal Singh, Wolf-Dietrich Weber, and Anoop Gupta. SPLASH: Stanford Parallel Applications for Shared Memory. Technical Report CSL-TR-91-469, Computer Systems Laboratory, Stanford University, April, 1991.
Larry Soule and Anoop Gupta. An Evaluation of the Chandy-Misra-Bryant Algorithm for Digital Logic Simulation. 1992. To appear in ACM Transactions on Modeling and Computer Simulation.
Jaswinder Pal Singh, Chris Holt, Takashi Totsuka, Anoop Gupta and John Hennessy. Load Balancing and Data Locality in Hierarchical N-Body Methods. Technical Report CSL-TR-92-505, Computer Systems Laboratory, Stanford University, February, 1992.
Jaswinder Pal Singh, John Hennessy, and Anoop Gupta. Implications of Hierarchical N-Body Methods for Multiprocessor Architecture. Technical Report CSL-TR-92-506, Computer Systems Laboratory, Stanford University, February, 1992.
Jaswinder Pal Singh and John Hennessy. Finding and Exploiting Parallelism in an Ocean Simulation Program: Experience, Results and Implications. 1992. To appear in Journal of Parallel and Distributed Computing.
Edward Rothberg and Anoop Gupta. The Performance Impact of Data Reuse in Parallel Dense Cholesky Factorization. Technical Report STAN-CS-92-1401, Computer Science Department, Stanford University, January, 1992.
Edward Rothberg and Anoop Gupta. An Evaluation of Left-Looking, Right-Looking, and Multifrontal Approaches to Sparse Cholesky Factorization on Hierarchical-Memory Machines. 1992. To appear in International Journal of High Speed Computing. Also available as Stanford University technical report STAN-CS-91-1377/CSL-TR-91-487, August 1991.
Edward Rothberg and Anoop Gupta. Parallel ICCG on a Hierarchical Memory Multiprocessor: Addressing the Triangular Solve Bottleneck. 1992. To appear in Parallel Computing.
Edward Rothberg and Anoop Gupta. Techniques for Improving the Performance of Sparse Matrix Factorization on Multiprocessor Workstations. In Proceedings of Supercomputing '90. November, 1990.
Edward Rothberg and Anoop Gupta. Efficient Sparse Matrix Factorization on High-Performance Workstations: Exploiting the Memory Hierarchy. ACM Transactions on Mathematical Software 17(3), September, 1991.
Margaret Martonosi and Anoop Gupta. Tradeoffs in Message-Passing and Shared-Memory Implementations of a Standard Cell Router. In Proceedings of International Conference on Parallel Processing. August, 1989.
Edward Rothberg and Anoop Gupta. Experiences Implementing a Parallel ATMS on a Shared-Memory Multiprocessor. In International Joint Conference on Artificial Intelligence. August, 1989.
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gupta, A. (1992). Stanford DASH multiprocessor: The hardware and software approach. In: Etiemble, D., Syre, JC. (eds) PARLE '92 Parallel Architectures and Languages Europe. PARLE 1992. Lecture Notes in Computer Science, vol 605. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-55599-4_125
DOI: https://doi.org/10.1007/3-540-55599-4_125
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55599-5
Online ISBN: 978-3-540-47250-6
eBook Packages: Springer Book Archive