Synthesizing MPI Implementations from Functional Data-Parallel Programs



Distributed memory architectures such as Linux clusters have become increasingly common but remain difficult to program. We target this problem and present a novel technique to automatically generate data distribution plans, and subsequently MPI implementations in C++, from programs written in a functional core language. The main novelty of our approach is that we support distributed arrays, maps, and lists in the same framework, rather than just arrays. We formalize distributed data layouts as types, which are then used both to search (via type inference) for optimal data distribution plans and to generate the MPI implementations. We introduce the core language and explain our formalization of distributed data layouts. We describe how we search for data distribution plans using an adaptation of the Damas–Milner type inference algorithm, and how we generate MPI implementations in C++ from such plans.
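
As a rough illustration of what a data distribution plan has to decide, the following C++ sketch computes which process owns which part of a collection: a block layout for an array and a hash layout for a map. It is not the paper's formalization or its generated code. The helper names `block_range` and `hash_owner` are hypothetical, and the sketch assumes the process count `p` is positive. In a real MPI program, `rank` and `p` would typically come from `MPI_Comm_rank` and `MPI_Comm_size`.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper (not from the paper): the half-open index range
// [begin, end) of an n-element array owned by `rank` under a block
// distribution over p processes. Assumes p > 0 and 0 <= rank < p.
std::pair<std::size_t, std::size_t> block_range(std::size_t n, int p, int rank) {
    const std::size_t procs = static_cast<std::size_t>(p);
    const std::size_t r = static_cast<std::size_t>(rank);
    const std::size_t base = n / procs;   // minimum elements per process
    const std::size_t extra = n % procs;  // the first `extra` ranks hold one more
    const std::size_t begin = r * base + std::min(r, extra);
    const std::size_t length = base + (r < extra ? 1 : 0);
    return {begin, begin + length};
}

// Hypothetical helper (not from the paper): the rank that owns a map key
// under hash partitioning over p processes. Assumes p > 0.
int hash_owner(const std::string& key, int p) {
    const std::size_t h = std::hash<std::string>{}(key);
    return static_cast<int>(h % static_cast<std::size_t>(p));
}
```

Each process would call `block_range` with its own rank to find its slice of an array, and would call `hash_owner` to decide where a key-value pair must be sent. Choosing between such layouts for every collection in a program is the kind of decision the type-based search described above automates.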


Data parallelism, Data distribution, Type inference, Code generation, MPI



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. University of Southampton, Southampton, UK
  2. University of Stellenbosch, Stellenbosch, South Africa
