Performance of MP3D on the SB-PRAM Prototype
The SB-PRAM is a shared-memory machine that hides latency by simple interleaved context switching and can be expected to behave almost exactly like a PRAM if all threads can be kept busy. We report measured run times of various versions of the MP3D benchmark on the completed hardware of a 64-processor SB-PRAM. The main findings of these experiments are: 1) parallel efficiency is 79% for 32 processors and 56% for 64 processors; 2) parallel efficiency is limited by the number of available threads.
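To put the reported efficiencies in more familiar terms, the sketch below converts them into implied speedups. It assumes the standard definition of parallel efficiency, E = S / p, where S is the speedup on p processors. The abstract does not state which baseline run time the efficiencies are measured against, so the resulting speedups are only indicative. The function name `speedup_from_efficiency` is a hypothetical helper introduced for illustration.

```python
def speedup_from_efficiency(efficiency: float, processors: int) -> float:
    """Return the speedup S implied by efficiency E on p processors.

    Assumes the standard definition E = S / p, hence S = E * p.
    """
    return efficiency * processors


# Efficiencies as reported in the abstract, keyed by processor count.
reported = {32: 0.79, 64: 0.56}

# Implied speedups under the assumed definition.
implied = {p: speedup_from_efficiency(e, p) for p, e in reported.items()}

for p, s in implied.items():
    print(f"{p} processors: efficiency {reported[p]:.0%}, implied speedup about {s:.1f}")
```

Under this assumption, doubling the processor count from 32 to 64 raises the implied speedup from about 25 to about 36, well short of doubling, which is consistent with the finding that efficiency is limited by the number of available threads.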