A Unifying Framework for Parallel Computing

  • Victor Eijkhout
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 247)


We propose a new theoretical model for parallelism. The model is explicitly based on data and work distributions, a feature missing from other theoretical models. The main theoretical result is that data movement can then be derived by formal reasoning. While the model has an immediate interpretation in distributed memory parallelism, we show that it can also accommodate shared memory and hybrid architectures such as clusters with accelerators. The model gives rise in a natural way to objects appearing in widely different parallel programming systems, such as the PETSc library or the Quark task scheduler. We therefore argue that the model offers the prospect of a high-productivity programming system that can be compiled down to proven high-performance environments.
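The abstract's central claim, that data movement can be derived formally from a data distribution and a work distribution, can be illustrated with a small sketch. The following is a hypothetical example, not code from the paper: distributions are modeled as mappings from a process id to the set of array indices it owns (data) or needs as operands (work), and the required messages fall out as set differences.

```python
# Hypothetical sketch (assumed representation, not the paper's API):
# a distribution maps process id -> set of array indices.

def derive_messages(data_dist, work_dist):
    """Return {(sender, receiver): indices} so that every process
    obtains the operands that its assigned work requires."""
    # Invert the data distribution to find the owner of each index.
    owner = {i: p for p, idxs in data_dist.items() for i in idxs}
    messages = {}
    for p, needed in work_dist.items():
        # Indices needed by p but not locally owned must be communicated.
        for i in needed - data_dist.get(p, set()):
            messages.setdefault((owner[i], p), set()).add(i)
    return messages

# Example: 8 elements block-distributed over two processes; a
# three-point stencil makes each process need one halo point.
data = {0: set(range(0, 4)), 1: set(range(4, 8))}
work = {0: set(range(0, 5)), 1: set(range(3, 8))}
print(derive_messages(data, work))  # {(1, 0): {4}, (0, 1): {3}}
```

The point of the sketch is that no communication is specified by the programmer: it is computed from the two distributions, which is the inversion of control the abstract describes.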


Keywords: Dataflow · Data distribution · High performance computing · Hybrid architecture · Message passing · Parallel programming



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Texas Advanced Computing Center (TACC), The University of Texas at Austin, Austin, USA
