Higher-Order and Symbolic Computation

, Volume 16, Issue 3, pp 203–251 | Cite as

Comparing Parallel Functional Languages: Programming and Performance

  • H.-W. Loidl
  • F. Rubio
  • N. Scaife
  • K. Hammond
  • S. Horiguchi
  • U. Klusik
  • R. Loogen
  • G.J. Michaelson
  • R. Peña
  • S. Priebe
  • Á.J. Rebón
  • P.W. Trinder


This paper presents a practical evaluation and comparison of three state-of-the-art parallel functional languages. The evaluation is based on implementations of three typical symbolic computation programs, with performance measured on a Beowulf-class parallel architecture.

We assess three mature parallel functional languages: PMLS, a system for implicitly parallel execution of ML programs; GPH, a mainly implicit parallel extension of Haskell; and Eden, a more explicit parallel extension of Haskell designed for both distributed and parallel execution. While all three languages employ a completely implicit approach to communication, each language takes a different approach to specifying and controlling parallelism, ranging from explicit identification of processes as language constructs (Eden) through annotation of potential parallelism (GPH) to automatic detection of parallel skeletons in sequential code (PMLS).

We present detailed performance measurements of all three systems on a widely available parallel architecture: a Beowulf cluster of low-cost commodity workstations. We use three representative symbolic applications: a matrix multiplication algorithm, an exact linear system solver, and a simple ray-tracer. Our results show how moderate speedups can be achieved with little or no changes to the sequential code, and that parallel performance can be significantly improved even within our high-level model of parallel functional programming by controlling key aspects of the program such as load distribution and thread granularity.

parallel computation functional programming skeletons, implicit parallelism automatic task decomposition load balancing Haskell ML 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Bacci, B., Danelutto, M., Orlando, S., Pelagatti, S., and Vanneschi, M. P3L: A structured high level programming language and its structured support. Concurrency—Practice and Experience, 7(3) (1995) 225–255.Google Scholar
  2. 2.
    Bacci, B., Gorlatch, S., Lengauer, C., and Pelagatti, S. Skeletons and transformations in an integrated parallel programming environment. In PACT'99—Intl. Conf. on Parallel Architecture and Compilations Techniques, Vol. 1662 of LNCS, 1999, pp. 13–27.Google Scholar
  3. 3.
    Blelloch, G. Programming parallel algorithms. Communications of the ACM, 39(3) (1996) 85–97.Google Scholar
  4. 4.
    Blelloch, G., Miller, G., and Talmor, D. Developing a practical projection-based parallel delaunay algorithm. In Symp. on Computational Geometry. Philadelphia, PA, 1996, pp. 186–195.Google Scholar
  5. 5.
    Blelloch, G. and Narlikar, G. A Practical comparison of N-body algorithms. In Parallel Algorithms, Vol. 30 of Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society, 1997.Google Scholar
  6. 6.
    Breitinger, S., Klusik, U., Loogen, R., Ortega, Y., and Peña, R. DREAM—the DistRibuted Eden Abstract Machine. In IFL'97—Intl. Workshop on the Implementation of Functional Languages 1997. Vol. 1467 of LNCS, St. Andrews, Scotland, 1997, pp. 250–269.Google Scholar
  7. 7.
    Breitinger, S., Loogen, R., Ortega, Y., and Peña, R. The Eden coordination model for distributed memory systems. In HIPS'97—Workshop on High-Level Parallel Programming Models. Geneva, Switzerland, 1997, pp. 120–124.Google Scholar
  8. 8.
    Burge, W. Recursive Programming Techniques. Addison-Wesley, 1975.Google Scholar
  9. 9.
    Cann, D. Retire fortran? A debate rekindled. Communications of the ACM, 35(8) (1992) 81–89.Google Scholar
  10. 10.
    Chakravarty M., Keller, G., Lechtchinsky, R., and Pfannenstiel, W. Nepal—Nested data-parallelism in Haskell. In EuroPar'01—European Conf. on Parallel Processing, Vol. 2150 of LNCS. Manchester, UK, Aug. 28–31, 2001, pp. 524–534.Google Scholar
  11. 11.
    Cole, M.I. Algorithmic Skeletons: Structured Management of Parallel Computationg. Research Monographs in Parallel and Distributed Computing. Cambridge, MA: The MIT Press, 1989.Google Scholar
  12. 12.
    Cook, A. Transformation and proof in a parallelising compiler. Ph.D. thesis, Dept. of Computing and Electrical Engineering, Heriot-Watt University, 2002.Google Scholar
  13. 13.
    Darlington, J., Guo, Y., and To, H. Structured parallel programming: Theory meets practice. In Research Directions in Computer Science, R. Milner and I. Wand (Eds.), Cambridge University Press, 1996a.Google Scholar
  14. 14.
    Darlington, J., Guo, Y., To, H., and Yang, J. SPF: Structured parallel fortran. In PCW'96—Intl. Parallel Computing Workshop. Kawasaki, Japan, 1996b.Google Scholar
  15. 15.
    Doligez, D. and Leroy, X. A concurrent, generational garbage collector for a multithreaded implementation of ML. In POPL'93—Symp. on Principles of Programming Languages. Charleston, SC, Jan. 1993, pp. 113–123.Google Scholar
  16. 16.
    Du Bois, A., Loidl, H.-W., and Trinder, P. Thread migration in a parallel graph reducer. In IFL'02—Intl. Workshop on the Implementation of Functional Languages, Vol. 2670 of LNCS. Madrid, Spain, Sept. 16–18, 2002.Google Scholar
  17. 17.
    Flanagan, C. and Nikhil, R. pHluid: The design of a parallel functional language implementation on workstations. In ICFP'96—Intl. Conf. on Functional Programming. Philadelphia, PA, May 24–26, 1996, pp. 169–179.Google Scholar
  18. 18.
    Frens, J. and Wise, D. Auto-blocking matrix multiplication, or tracking BLAS3 performance from source code. PPoPP'97—Symp. on Principles and Practice of Parallel Programming, 32(7) (1997) 206–216.Google Scholar
  19. 19.
    Goldberg, B. Multiprocessor execution of functional programs. Intl. J. of Parallel Programming, 17(5) (1988) 425–473.Google Scholar
  20. 20.
    Halstead, R. Multilisp: A language for concurrent symbolic computation. ACM Transactions on Programming Languages and Systems, 7(4) (1985) 106–117.Google Scholar
  21. 21.
    Hamdan, M. A combinational framework for parallel programming using algorithmic skeletons. Ph.D. thesis, Dept. of Computing and Electrical Engineering, Heriot-Watt University, 2000.Google Scholar
  22. 22.
    Hammes, J., Lubeck, O., and Böhm, W. Comparing Id and Haskell in a Monte Carlo photon transport code. J. of Functional Programming, 5(3) (1995) 283–316.Google Scholar
  23. 23.
    Hammond, K., King, D., Loidl, H.-W., Rebón, A., and Trinder, P. The HasPar performance evaluation suite for GPH: A parallel non-strict functional language. Technical report, 2000.Google Scholar
  24. 24.
    Hammond, K., Loidl, H.-W., and Partridge, A. Visualising granularity in parallel programs: A graphical winnowing system for Haskell. In HPFC'95—Conf. on High Performance Functional Computing. Denver, CO, April 10–12, 1995, pp. 208–221.Google Scholar
  25. 25.
    Hammond, K. and Michaelson, G. (Eds.). Research Directions in Parallel Functional Programming. Springer, 1999.Google Scholar
  26. 26.
    Hammond, K. and Rebón, A. HaskSkel: Algorithmic skeletons for Haskell. In IFL'99—Intl. Workshop on the Implementation of Functional Languages, Vol. 1868 of LNCS. Lochem, The Netherlands, Sept. 7–10, 1999.Google Scholar
  27. 27.
    Hartel, P., Feeley, M., Alt, M., Augustsson, L., Baumann, P., Beemster, M., Chailloux, E., Flood, C., Grieskamp, W., van Groningen, J., Hammond, K., Hausman, B., Ivory, M., Jones, R., Kamperman, J., Lee, P., Leroy, X., Lins, R., Loosemore, S., Röjemo, N., Serrano, M., Talpin, J.-P., Thackray, J., Thomas, S., Walters, P., Weis, P., and Wentworth, P. Benchmarking implementations of functional languages with “Pseudoknot”, a float-intensive benchmark. J. of Functional Programming, 6(4) 1996.Google Scholar
  28. 28.
    Hernández, F., Peña, R., and Rubio, F. From GranSim to Paradise. In SFP'00—Scottish Functional Programming Workshop, Vol. 2 of Trends in Functional Programming. St. Andrews, Scotland, Jul 26–28, 2000, pp. 11–19.Google Scholar
  29. 29.
    Herrmann, C. The skeleton-based parallelization of divide-and-conquer recursions. Ph.D. thesis, University of Passau, 2000.Google Scholar
  30. 30.
    Hofmann, M. A type system for bounded space and functional in-place update. Nordic Journal of Computing, 7(4) (2000) 258–289.Google Scholar
  31. 31.
    Holmerin, J. and Lisper, B. Development of parallel algorithms in data field Haskell. In EuroPar'00—European Conf. on Parallel Processing, Vol. 1900 of LNCS. Munich, Germany, Aug. 29–Sept. 1, 2000, pp. 762–766.Google Scholar
  32. 32.
    Hudak, P. Para-functional programming. IEEE Computer, 19(8) (1986) 60–70.Google Scholar
  33. 33.
    Hughes, R. Why functional programming matters. The Computer Journal, 32(2) (1989) 98–107.Google Scholar
  34. 34.
    Impala: 2001, Impala—(IMplicitly PArallel LAnguage Application Suite). <URL: http://www.csg.>.Google Scholar
  35. 35.
    Karatsuba, A. and Ofman, Y. Multiplication of multi-digit numbers on automata. Soviet. Phys. Dokl., (7) (1962) 595–596.Google Scholar
  36. 36.
    Kelly, P. Functional Programming for Loosely-Coupled Multiprocessors. Research Monographs in Parallel and Distributed Computing. MIT Press, 1989.Google Scholar
  37. 37.
    Kesseler, M. The implementation of functional languages on parallel machines with distributed memory. Ph.D. thesis, Univ. of Nijmegen, 1996.Google Scholar
  38. 38.
    King, D., Hall, J., and Trinder, P. A strategic profiler for glasgow parallel Haskell. In IFL'98—Intl. Workshop on the Implementation of Functional Languages, Vol. 1595 of LNCS. London, UK, Sept. 9–11, 1998, pp. 88–102.Google Scholar
  39. 39.
    Kingdon, H., Lester, D., and Burn, G. The HDG-machine: A highly distributed graph-reducer for a transputer network. Computer Journal, 34(4) (1991) 290–301.Google Scholar
  40. 40.
    Klusik, U., Loogen, R., Priebe, S., and Rubio, F. Implementation skeletons in Eden—Low-effort parallel programming. In IFL'00—Intl. Workshop on the Implementation of Functional Languages, Vol. 2011 of LNCS. Aachen, Germany, Sept. 4–7, 2000, pp. 71–88.Google Scholar
  41. 41.
    Klusik, U., Peña, R., and Rubio, F. Replicated workers in Eden. In CMPP'00—Constructive Methods for Parallel Programming. Ponte di Lima, Portugal. Nova Science Books, 2001.Google Scholar
  42. 42.
    LANL: 2001, Sisal Performance Data. <URL: html>.Google Scholar
  43. 43.
    Lauer, M. Computing by homomorphic images. In Computer Algebra—Symbolic and Algebraic Computation, B. Buchberger, G.E. Collins, R. Loos, and R. Albrecht (Eds.), Springer, 1982, pp. 139–168.Google Scholar
  44. 44.
    Lester, B. The Art of Parallel Programming. Prentice-Hall, 1993.Google Scholar
  45. 45.
    Lipson, J.D. Chinese remainder and interpolation algorithms. In SYMSAM'71—Symp. on Symbolic and Algebraic Manipulation, 1971, pp. 372–391.Google Scholar
  46. 46.
    Loidl, H.-W. LinSolv: A case study in strategic parallelism. In Glasgow Workshop on Functional Programming. Ullapool, Scotland, Sept. 15–17, 1997.Google Scholar
  47. 47.
    Loidl, H.-W. Granularity in large-scale parallel functional programming. Ph.D. thesis, Dept. of Computing Science, Univ. of Glasgow, 1998.Google Scholar
  48. 48.
    Loidl, H.-W. Load balancing in a parallel graph reducer. In SFP'01—Scottish Functional Programming Workshop, Vol. 3 of Trends in Functional Programming. Stirling, Scotland, Aug. 22–24, 2001, pp. 63–74.Google Scholar
  49. 49.
    Loidl, H.-W. The virtual shared memory performance of a parallel graph reducer. In CCGrid/DSM 2002—Intl. Symp. on Cluster Computing and the Grid. Berlin, Germany, May 21–24, 2002, pp. 311–318.Google Scholar
  50. 50.
    Loidl, H.-W., Morgan, R., Trinder, P.W., Poria, S., Cooper, C., Peyton Jones, S.L., and Garigliano, R. Parallelising a large functional program rr: Keeping LOLITA busy. In IFL'97—Intl. Workshop on the Implementation of Functional Languages 1997, Vol. 1467 of LNCS. St Andrews, Scotland, Sept. 10–12, 1997, pp. 198–213.Google Scholar
  51. 51.
    Loidl, H.-W., Scaife, N., Michaelson, G., and Trinder, P. Implementation designs for parallel functional languages. In: PPDP'03—Intl. Conf. on Principles and Practice of Declarative Programming. Uppsala, Sweden, Aug 27–29, 2003, Submitted.Google Scholar
  52. 52.
    Loidl, H.-W., Trinder, P., and Butz, C. Tuning task granularity and data locality of data parallel GpH programs. Parallel Processing Letters, 11(4) (2001) 471–486.Google Scholar
  53. 53.
    Loidl, H.-W., Trinder, P., Hammond, K., Junaidu, S., Morgan, R., and Peyton Jones, S. Engineering parallel symbolic programs in GPH. Concurrency—Practice and Experience, 11(12) (1999) 701–752.Google Scholar
  54. 54.
    Loogen, R. Programming language constructs. In Research Directions in Parallel Functional Programming. Springer, 1999, pp. 63–91.Google Scholar
  55. 55.
    Michaelson, G., Scaife, N., Bristow, P., and King, P. Nested algorithmic skeletons from higher order functions. Parallel Algorithms and Applications, 16 (2001) 181–206. Special Issue on High Level Models and Languages for Parallel Processing.Google Scholar
  56. 56.
    Milner, R., Tofte, M., Harper, R., and MacQueen, D. The Definition of Standard ML (Revised). Cambridge, MA: MIT Press, 1997.Google Scholar
  57. 57.
    Mirani, R. High-level abstractions for parallel functional programming. Ph.D. thesis, Yale University, 1996.Google Scholar
  58. 58.
    Mirani, R. and Hudak, P. First-class schedules and virtual maps. In FPCA'95—Conf. on Functional Programming Languages and Computer Architecture. La Jolla, CA, June 26–28, 1995, pp. 78–85.Google Scholar
  59. 59.
    Mohr, E., Kranz, D., and Halstead Jr., R. Lazy task creation: A technique for increasing the granularity of parallel programs. IEEE Transactions on Parallel and Distributed Systems, 2(3) (1991) 264–280.Google Scholar
  60. 60.
    MPI: MPI-2: extensions to the message-passing interface. Technical report, Univ. of Tennessee, Knoxville, 1997.Google Scholar
  61. 61.
    Nikhil, R. and Arvind. Implicit Parallel Programming in pH. Morgan Kaufmann Publishers, 2001.Google Scholar
  62. 62.
    Nöcker, E., Smetsers, J., van Eekelen, M., and Plasmeijer, M. Concurrent clean. In PARLE'91—Parallel Architectures and Languages Europe, Vol. 505 of LNCS. Veldhoven, The Netherlands, 1991, pp. 202–219.Google Scholar
  63. 63.
    Pelagatti, S. Task and data parallelism in P3L. In Patterns and Skeletons forParallel and Distributed Computing F. Rabhi and S. Gorlatch (Eds.), Springer, 2002.Google Scholar
  64. 64.
    Peña, R. and Rubio, F. Parallel functional programming at two levels of abstraction. In PPDP'01—Intl. Conf. on Principles and Practice of Declarative Programming. Firenze, Italy, Sept. 5–7, 2001, pp. 187–198.Google Scholar
  65. 65.
    Pepper, P. Deductive derivation of parallel programs. In Parallel Algorithm Derivation and Program Transformation. Kluwer Academic Publishers, 1993, pp. 1–53.Google Scholar
  66. 66.
    Peyton Jones, S., Hall, C., Hammond, K., Partain, W., and Wadler, P. The glasgow Haskell compiler: A technical overview. In Joint Framework for Information Technology Technical Conference. Keele, UK, 1993, pp. 249–257.Google Scholar
  67. 67.
    Peyton Jones, S., Hughes, J., Augustsson, L., Barton, D., Boutel, B., Burton, W., Fasel, J., Hammond, K., Hinze, R., Hudak, P., Johnsson, T., Jones, M., Launchbury, J., Meijer, E., Peterson, J., Reid, A., Runciman, C., and Wadler, P. Haskell 98: A non-strict, purely functional language, 1999. Available at <URL:>.Google Scholar
  68. 68.
    Plasmeijer, R., van Eekelen, M., Pil, M., and Serrarens, P. Parallel and distributed programming in concurrent clean. In Research Directions in Parallel Functional Programming. Springer, 1999, pp. 323–338.Google Scholar
  69. 69.
    Press, W., Teukolsky, S., Vetterling, W., and Flannery, B. Numerical Recipes in C: The Art of Scientific Computing. Chapt. LU Decomposition and Its Applications. Cambridge University Press, 2nd edition, 1992.Google Scholar
  70. 70.
    PVM: Parallel virtual machine reference manual, version 3.2. Univ. of Tennessee, 1993.Google Scholar
  71. 71.
    Quinn, M. Parallel Computing. McGraw-Hill, 1994.Google Scholar
  72. 72.
    Ridge, D., Becker, D., Merkey, P., and Sterling, T. Beowulf: Harnessing the power of parallelism in a pile-of-PCs. In IEEE Aerospace Conference, 1997, pp. 79–91.Google Scholar
  73. 73.
    Rubio, F. Programación Funcional Paralela Eficiente en Eden. Ph.D. thesis, Universidad Complutense de Madrid, Spain, 2001, in Spanish.Google Scholar
  74. 74.
    Sansom, P. and Peyton Jones, S. Generational garbage collection for Haskell, In FPCA'93—Functional Programming Languages and Computer Architecture. Copenhagen, Denmark, June 9–11, 1993, pp. 106–116.Google Scholar
  75. 75.
    Sansom, P. and Peyton Jones, S. Time and space profiling for non-strict, higher-order functional languages. In POPL'95—Symp. on Principles of Programming Languages. San Francisco, CA, Jan. 23–25, 1995, pp. 355–366.Google Scholar
  76. 76.
    Scaife, N., Michaelson, G., and Horiguchi, S. Comparative cross-platform performance results from a parallelizing SML compiler. In IFL'01—Intl. Workshop on the Implementation of Functional Languages, Vol. 2312 of LNCS. Stockholm, Sweden, Sept. 24–26, 2001, pp. 138–154.Google Scholar
  77. 77.
    Serot, J. Tagged-token data-flow for skeletons. Parallel Processing Letters 11(4) (2001) 377–392.Google Scholar
  78. 78.
    Taylor, F. Parallel functional programming by partitioning. Ph.D. thesis, Univ. of London, 1996.Google Scholar
  79. 79.
    Trinder, P., Hammond, K., Loidl, H.-W., and Peyton Jones, S. Algorithm + strategy = parallelism. J. of Functional Programming 8(1) (1998) 23–60.Google Scholar
  80. 80.
    Trinder, P., Hammond, K., Mattson Jr., J., Partridge, A., and Peyton Jones, S. GUM: A portable parallel implementation of Haskell. In PLDI'96—Programming Language Design and Implementation. Philadephia, PA, May 21–24, 1996, pp. 78–88.Google Scholar
  81. 81.
    Trinder, P., Loidl, H.-W., and Pointon, R. Parallel and distributed Haskells. J. of Functional Programming 12(4/5) (2002) 469–510.Google Scholar
  82. 82.
    WWW-GPH. Glasgow Parallel Haskell, 2001. WWW page. <URL:>.Google Scholar

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • H.-W. Loidl
    • 1
  • F. Rubio
    • 2
  • N. Scaife
    • 3
  • K. Hammond
    • 4
  • S. Horiguchi
    • 3
  • U. Klusik
    • 5
  • R. Loogen
    • 5
  • G.J. Michaelson
    • 1
  • R. Peña
    • 2
  • S. Priebe
    • 5
  • Á.J. Rebón
    • 6
  • P.W. Trinder
    • 1
  1. 1.School of Mathematical and Computer SciencesHeriot-Watt UniversityEdinburghScotland
  2. 2.Dpto. Sistemas Informáticos y ProgramaciónUniversidad Complutense de MadridMadridSpain
  3. 3.Japan Advanced Institute for Science and TechnologyTatsunokuchi, Nomigun, Ishikawa
  4. 4.School of Computer ScienceUniversity of St. AndrewsScotland
  5. 5.Fachbereich Mathematik und InformatikPhilipps-Universität MarburgMarburgGermany
  6. 6.School of Computer ScienceUniversity of St. AndrewsScotland

Personalised recommendations