Optimizations on Array Skeletons in a Shared Memory Environment

  • Clemens Grelck
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2312)


Map- and fold-like skeletons are suitable abstractions to guide parallel program execution in functional array processing. However, when it comes to achieving high performance, it turns out that confining compilation efforts to individual skeletons is insufficient. This paper proposes compilation schemes which aim at reducing runtime overhead due to communication and synchronization by embedding multiple array skeletons within a so-called SPMD meta skeleton. Whereas the meta skeleton takes exclusive responsibility for organizing parallel program execution, the original array skeletons are reduced to their individual numerical operations. While the concrete compilation schemes assume multithreading in a shared memory environment as the underlying execution model, the ideas can be carried over to other settings straightforwardly. Preliminary performance investigations help to quantify potential benefits.
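The idea of embedding several array skeletons in one SPMD region can be illustrated with a small sketch. This is not the paper's compilation scheme: the helper names (`chunk_bounds`, `run_region`, `separate_skeletons`, `spmd_meta_skeleton`) are hypothetical, Python threads stand in for the multithreaded runtime, and the static block distribution is an assumption. The sketch only shows that a map followed by a fold needs two fork/join phases when each skeleton organizes its own parallel execution, but a single one when both run inside a shared SPMD body.

```python
import threading


def chunk_bounds(n, workers, w):
    """Return the half-open index range [lo, hi) owned by worker w."""
    base, extra = divmod(n, workers)
    lo = w * base + min(w, extra)
    hi = lo + base + (1 if w < extra else 0)
    return lo, hi


def run_region(body, workers):
    """Start `workers` threads running body(w) and wait for all of them.

    Each call models one fork/join synchronization of the whole team.
    """
    threads = [threading.Thread(target=body, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def separate_skeletons(xs, f, workers=4):
    """Map and fold each organize their own parallel execution."""
    n = len(xs)
    ys = [None] * n
    partial = [0] * workers

    def map_body(w):
        lo, hi = chunk_bounds(n, workers, w)
        for i in range(lo, hi):
            ys[i] = f(xs[i])

    def fold_body(w):
        lo, hi = chunk_bounds(n, workers, w)
        partial[w] = sum(ys[lo:hi])

    run_region(map_body, workers)   # fork/join phase 1
    run_region(fold_body, workers)  # fork/join phase 2
    return sum(partial), 2          # result and number of fork/join phases


def spmd_meta_skeleton(xs, f, workers=4):
    """One SPMD region runs the bodies of both skeletons back to back."""
    n = len(xs)
    ys = [None] * n
    partial = [0] * workers

    def spmd_body(w):
        lo, hi = chunk_bounds(n, workers, w)
        for i in range(lo, hi):
            ys[i] = f(xs[i])
        # No barrier is needed here: with identical block distributions the
        # fold reads only the elements this same worker has just written.
        partial[w] = sum(ys[lo:hi])

    run_region(spmd_body, workers)  # the only fork/join phase
    return sum(partial), 1
```

Both functions return the same sum; they differ only in how often the thread team is started and joined. The barrier can be dropped only because the fold's data dependency stays within each worker's own block. A fold that read elements mapped by other workers would still need synchronization between the two steps.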


Keywords (added by machine, not by the authors): Data Dependency, Parallel Execution, Program Execution, Numerical Operation, Runtime Overhead





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Clemens Grelck, Institute for Software Technology and Programming Languages, Medical University of Lübeck, Lübeck, Germany
