The Challenges of Efficient Code-Generation for Massively Parallel Architectures

  • Jason M McGuiness
  • Colin Egan
  • Bruce Christianson
  • Guang Gao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4186)


The memory wall [15] may be overcome by increasing the bandwidth and reducing the latency of the processor-to-memory connection, for example by implementing cellular architectures such as the IBM Cyclops. Such massively parallel architectures have sophisticated memory models. In this paper we use DIMES (the Delaware Iterative Multiprocessor Emulation System), developed by CAPSL at the University of Delaware, as a hardware evaluation tool for cellular architectures. The authors contend that it remains an open question which approach to parallelism is ideal from the programmer's perspective: at the language level, for example UPC or HPF; through compiler techniques such as trace scheduling; or at the library level, for example OpenMP or POSIX threads. To investigate this, we chose a threaded Mandelbrot-set generator with a work-stealing algorithm to evaluate the DIMES cthread programming model for writing a simple multi-threaded program.
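As a rough illustration of the kind of program described, the sketch below renders a small Mandelbrot set with worker threads that steal rows from one another when their own queue runs dry. It is not the authors' DIMES cthread code: Python's standard `threading` module stands in for the cthread model, and every name here (`escape_time`, `render_row`, `mandelbrot_work_stealing`, the region bounds, the iteration limit) is a hypothetical choice made for the example. A single lock guards all queues, which keeps the sketch short but would not scale on a real cellular architecture.

```python
import threading
from collections import deque

MAX_ITER = 50  # assumed iteration limit, chosen only for this sketch


def escape_time(c, max_iter=MAX_ITER):
    """Iterations before |z| exceeds 2 under z -> z*z + c (max_iter if never)."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render_row(y, width, height):
    """Compute one image row over the region [-2, 1] x [-1, 1]."""
    im = -1.0 + 2.0 * y / (height - 1)
    return [escape_time(complex(-2.0 + 3.0 * x / (width - 1), im))
            for x in range(width)]


def mandelbrot_work_stealing(width, height, n_workers=4):
    """Render the image row by row; idle workers steal rows from busy ones.

    Requires width >= 2 and height >= 2. Returns (image, stolen), where
    stolen[i] counts the rows worker i took from another worker's queue.
    """
    # Rows are dealt round-robin into one queue per worker.
    queues = [deque(range(w, height, n_workers)) for w in range(n_workers)]
    lock = threading.Lock()  # one coarse lock over all queues
    image = [None] * height
    stolen = [0] * n_workers

    def take(me):
        with lock:
            if queues[me]:
                return queues[me].popleft()  # own work: take from the head
            victim = max(range(n_workers), key=lambda v: len(queues[v]))
            if queues[victim]:
                stolen[me] += 1
                return queues[victim].pop()  # steal from the victim's tail
        return None  # every queue is empty, so this worker can stop

    def worker(me):
        while True:
            y = take(me)
            if y is None:
                break
            image[y] = render_row(y, width, height)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return image, stolen
```

Calling `mandelbrot_work_stealing(80, 24)` returns the grid of escape counts together with the per-worker steal counts. Rows near the set's interior cost far more than rows outside it, which is why a static split of rows tends to leave some threads idle and why stealing is a natural fit for this workload.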


Keywords: Memory Model, Cellular Architecture, Work Thread, Distribute Processing Symposium, Memory Consistency Model
These keywords were added by machine and not by the authors.




References

  1. Almási, G., Cascaval, C., Castaños, J.G., Denneau, M., Lieber, D., Moreira, J.E., Warren, H.S.: Dissecting Cyclops: Detailed Analysis of a Multithreaded Architecture. ACM SIGARCH Computer Architecture News 31 (March 2003)
  2. Cascaval, C., Castaños, J.G., Ceze, L., Denneau, M., Gupta, M., Lieber, D., Moreira, J.E., Strauss, K., Warren, H.S.: Evaluation of a Multithreaded Architecture for Cellular Computing. In: 8th International Symposium on High-Performance Computer Architecture (HPCA) (2002)
  3. Cavalheiro, G.G.H., Doreille, M., Galilée, F., Gautier, T., Roch, J.-L.: Scheduling Parallel Programs on Non-Uniform Memory Architectures. In: HPCA Conference – Workshop on Parallel Computing for Irregular Applications WPCIA1, Orlando, USA (January 1999)
  4. del Cuvillo, J.B., Zhu, W., Hu, Z., Gao, G.R.: FAST: A Functionally Accurate Simulation Toolset for the Cyclops-64 Cellular Architecture. In: Workshop on Modeling, Benchmarking and Simulation (MoBS), held in conjunction with the 32nd Annual International Symposium on Computer Architecture (ISCA 2005), Madison, Wisconsin, June 4 (2005)
  5. del Cuvillo, J.B., Zhu, W., Hu, Z., Gao, G.R.: TiNy Threads: a Thread Virtual Machine for the Cyclops64 Cellular Architecture. In: Fifth Workshop on Massively Parallel Processing (WMPP), held in conjunction with the 19th International Parallel and Distributed Processing Symposium, Denver, Colorado, April 3-8 (2005)
  6. Duller, A., Towner, D., Panesar, G., Gray, A., Robbins, W.: picoArray technology: the tool’s story. In: Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. IEEE, Los Alamitos (2005)
  7. Gao, G.R., Sarkar, V.: Location Consistency - a New Memory Model and Cache Consistency Protocol. IEEE Transactions on Computers 49(8) (August 2000)
  8. Gao, G.R., Theobald, K.B., Govindarajan, R., Leung, C., Hu, Z., Wu, H., Lu, J., del Cuvillo, J., Jacquet, A., Janot, V., Sterling, T.L.: Programming Models and System Software for Future High-End Computing Systems: Work-in-Progress. In: International Parallel and Distributed Processing Symposium (IPDPS 2003), Nice, France, April 22-26 (2003)
  9. El-Ghazawi, T.A., Carlson, W.W., Draper, J.M.: UPC Language Specifications V1.1.1 (October 2003)
  10. Kakulavarapu, P., Morrone, C.J., Theobald, K., Amaral, J.N., Gao, G.R.: A Comparative Performance Study of Fine-Grain Multi-threading on Distributed Memory Machines. In: 19th IEEE International Performance, Computing and Communication Conference (IPCCC 2000), Phoenix, Arizona, USA, February 20-22 (2000)
  11. McGuiness, J.M.: A DIMES Demonstration Application: Mandelbrot-Set Generation Using a Work-Stealing Algorithm. CAPSL Technical Note 11, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware (June 2003)
  12. Mandelbrot, B.B.: The Fractal Geometry of Nature. W.H. Freeman & Co., New York (1982)
  13. Rodenas, D., Martorell, X., Ayguade, E., Labarta, J., Almasi, G., Cascaval, C., Castanos, J., Moreira, J.: Optimizing NANOS OpenMP for the IBM Cyclops Multithreaded Architecture. In: 19th IEEE International Parallel and Distributed Processing Symposium, vol. 1, p. 110 (2005)
  14. Sakane, H., Yakay, L., Karna, V., Leung, C., Gao, G.R.: DIMES: An Iterative Emulation Platform for Multiprocessor-System-on-Chip Designs. In: IEEE International Conference on Field-Programmable Technology, Tokyo, Japan, December 15-17 (2003)
  15. Wulf, W., McKee, S.: Hitting the memory wall: Implications of the obvious. Computer Architecture News 23(1), 20–24 (1995)
  16. Zhang, Y., Zhu, W., Chen, F., Hu, Z., Gao, G.R.: Sequential Consistency Revisited: The Sufficient Conditions and Method to Reason Consistency Model of a Multiprocessor-on-a-chip Architecture. In: The IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 2005), Innsbruck, Austria, February 15-17 (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jason M McGuiness (1)
  • Colin Egan (1)
  • Bruce Christianson (1)
  • Guang Gao (2)
  1. Department of Compiler Technology and Computer Architecture, University of Hertfordshire, Hatfield, Hertfordshire, U.K.
  2. CAPSL, University of Delaware, Delaware, U.S.A.
