GridFOR: A Domain Specific Language for Parallel Grid-Based Applications

Article
  • 172 Downloads

Abstract

To ease the programming burden and to make parallel programs more maintainable, computational scientists and engineers currently have the options to use software libraries, templates, and general purpose language extensions to compose their application programs. These existing options, unfortunately, have considerable limitations with compatibility, expressive power and delivered performance. To address these issues, we design a domain specific language, GridFOR, for computational problems defined over regular geometric grids. This language allows the programmer to first implement an algorithm on simple data structures, as commonly illustrated in textbooks or papers. The programmer then specifies transformations to extend the algorithm for complex data structures required by the target applications. We build a compiler to automatically translate a GridFOR program to a parallel Fortran version with Message Passing Interface calls. Several optimization techniques are implemented in our compiler to enhance the program speed.

Keywords

Domain specific languages Programming models  Program transformation Compiler analysis and optimization 

References

  1. 1.
  2. 2.
    Anderson, E., Bai, Z., Dongarra, J., Greenbaum, A., McKenney, A., Du Croz, J., Hammerling, S., Demmel, J., Bischof, C., and Sorensen, D.: LAPACK: A portable linear algebra library for high-performance computers. In: Supercomputing ’90: Proceedings of the 1990 ACM/IEEE Conference on Supercomputing, pp. 2–11 (1990)Google Scholar
  3. 3.
    Bogey, C., de Cacqueray, N., Bailly, C.: Finite differences for coarse azimuthal discretization and for reduction of effective resolution near origin of cylindrical flow equations. J. Comput. Phys. 230, 1134–1146 (2011)MathSciNetCrossRefMATHGoogle Scholar
  4. 4.
    Brickner, R.G., George, W., Johnsson, S.L., Ruttenberg, A.: A stencil compiler for the connection machine models CM-2/200. In: Proceedings of the Fourth Workshop on Compilers for Parallel Computers (1993)Google Scholar
  5. 5.
    Christen, M., Schenk, O., and Burkhart, H.: Patus: A code generation and auto-tuning framework for parallel stencil computations. In: Proceedings of IPDPS 2011 (2011)Google Scholar
  6. 6.
    Cole, M.: Algorithmic Skeletons: Structured Management of Parallel Computation. MIT Press, Cambridge (1991)MATHGoogle Scholar
  7. 7.
    DeVito, Z., Joubert, N., Palacios, S.O.F., Medina, M., Barrientos, M., Elsen, E., Ham, F., Aiken, A., Duraisamy, K., Darve, E., Alonso, J., Hanrahan, P.: Liszt: a domain specific language for building portable mesh-based pde solvers. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011)Google Scholar
  8. 8.
    Frigo, M., and Johnson, S.G.: FFTW: An adaptive software architecture for the FFT. In: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, volume 3, pp. 1381–1384 (1998)Google Scholar
  9. 9.
    González-Vélez, H., Leyton, M.: A survey of algorithmic skeleton frameworks: high-level structured parallel programming enablers. Softw. Pract. Exp. 40(12), 1135–1160 (2010)CrossRefGoogle Scholar
  10. 10.
    Grelck, C.: Single Assignment c (sac) High Productivity Meets High Performance. In: Central European Functional Programming School, pp. 207–278. Springer, Berlin (2012)CrossRefGoogle Scholar
  11. 11.
    Grelck, C., and Penczek, F.: Design and implementation of CAOS: an implicitly parallel language for the high-performance simulation of cellular automata. In: Salcido, A. (ed.) Cellular Automata: Simplicity Behind Complexity, pp. 545–566 (2011)Google Scholar
  12. 12.
    Hasert, M., Klimach, H., and Roller, S.: Caf versus mpi—applicability of coarray fortran to a flow solver. In: Proceedings of the 18th European MPI Users’ Group conference on Recent advances in the message passing interface, EuroMPI’11, pp. 228–236 (2011)Google Scholar
  13. 13.
    Loveman, D.B.: High performance fortran. Parallel & Distrib. Technol. Syst. & Appl. IEEE 1, 25–42 (1993)CrossRefGoogle Scholar
  14. 14.
    Martha, C.S.: Toward high-fidelity subsonic jet noise prediction using petascale supercomputers. Ph.D. dissertation, School of Aeronautics and Astronautics, Purdue University (2013)Google Scholar
  15. 15.
    Maruyama, N., Nomura, T., Sato, K., Matsuoka, S.: Physis: An implicitly paralell programming model for stencil computations on large-scale gpu-accelerated supercomputers. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011)Google Scholar
  16. 16.
    Murai, H., and Sato, M.: An efficient implementation of stencil communication for the xcalablemp pgas parallel programming language. In: 7th International Conference on PGAS Programming Models, pp. 142 (2013)Google Scholar
  17. 17.
    Numrich, R.W., Reid, J.: Co-array fortran for parallel programming. ACM Sigplan Fortran Forum 17(2), 1–31 (1998)CrossRefGoogle Scholar
  18. 18.
    Orchard, D. A., Bolingbroke, M., and Mycroft, A.: Ypnos: Declarative, parallel structured grid programming. In: DAMP’10: Proceedings of the 5th ACM Sigplan workshop on Declarative aspects of multicore programming, pp. 15–24Google Scholar
  19. 19.
    Parr, T.J., Quong, R.W.: Antlr: a predicated-ll(k) parser generator. Softw. Pract. Exp. 25(7), 789–810 (1995)CrossRefGoogle Scholar
  20. 20.
    Polizzi, E., Sameh, A.H.: A parallel hybrid banded system solver : the spike algorithm. Parallel Comput. 32(2), 177–194 (2006)MathSciNetCrossRefGoogle Scholar
  21. 21.
    Puschel, M., Moura, J.M., Johnson, J.R., Padua, D., Veloso, M.M., Singer, B.W., Xiong, J., Franchetti, F., Gacic, A., Voronenko, Y., et al.: Spiral: code generation for dsp transforms. Proceedings of the IEEE 93(2), pp. 232–275 (2005)Google Scholar
  22. 22.
    Roth, G., Mellor-Crummey, J., Kennedy, K., and Brickner, R.G.: Compiling stencils in high performance fortran. In: Proceedings of the International Conference on Supercomputing (1997)Google Scholar
  23. 23.
    Seinstra, F.J., Koelma, D., Bagdanov, A.D.: Finite state machine-based optimization of data parallel regular domain problems applied in low-level image processing. Parallel Distrib. Syst. IEEE Trans. on 15(10), 865–877 (2004)CrossRefGoogle Scholar
  24. 24.
    Situ, Y., Liu, L., Martha, C., Louis, M., Li, Z., Sameh, A.H., Blasidell, G.A., and Lyrintzis, A.S.: Reducing communication overhead in large eddy simulation for jet engine noise. In: Cluster Computing, 2010 IEEE International Conference, pp. 255–264 (2010)Google Scholar
  25. 25.
    Situ, Y., Wang, Y., and Li, Z.: Automated rapid prototyping of regular grid-based numerical applications using generalized elemental subroutines. In: IEEE 27th International Symposium on Parallel & Distributed Processing (2013)Google Scholar
  26. 26.
    Tang, Y., Chowdhury, R., Kuszmaul, B., Luk, C., and Leiserson, C.: The pochoir stencil compiler. In: Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures, pp. 117–128 (2011)Google Scholar
  27. 27.
    Unat, D., Cai, X., and Baden, S.B.: Mint: realizing cuda performance in 3d stencil methods with annotated c. In: Proceedings of the International Conference on Supercomputing (2011)Google Scholar

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. 1.Department of Computer SciencePurdue UniversityWest LafayetteUSA

Personalised recommendations