Performance comparison of strategies for static mapping of parallel programs

  • M. A. Senar
  • A. Ripoll
  • A. Cortés
  • E. Luque
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1225)


We address the problem of devising efficient strategies for mapping arbitrary parallel programs onto distributed-memory, message-passing parallel computers. An efficient task-assignment strategy based on two phases (task clustering and task reassignment) is proposed. The strategy is suitable for applications that can be partitioned into parallel executable tasks, and its design represents a trade-off between solution quality and computational complexity. It is evaluated by comparison with representative heuristics from the literature that cover the range of complexity categories: two simple greedy algorithms with low complexity (Largest Processing Time First and Largest Global Cost First), two iterative heuristics with the highest complexity (Simulated Annealing and Tabu Search), and a mixed heuristic (Even Distribution and Task Reassignment) with a complexity intermediate between the other two. The proposed strategy proves very effective both for computations whose communication topology matches well-known regular graph families such as trees, rings and meshes, and for arbitrary computations with irregular communication patterns. Its solution quality is better than that of the other mixed heuristic and as good as (and sometimes better than) that of Simulated Annealing.
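To make the two-phase idea concrete (group heavily communicating tasks into clusters, then reassign tasks to even out processor load), the following is a minimal Python sketch. The greedy edge-merging rule, the largest-cluster-first placement and the single-task move rule below are assumptions made for exposition only; they are not the algorithm evaluated in the paper, whose details are given in the full text.

```python
# Minimal sketch of a two-phase mapping heuristic (clustering followed by
# reassignment). The merge and move rules are illustrative assumptions,
# not the strategy evaluated in the paper.

def cluster_tasks(cost, comm, num_procs):
    """Phase 1: greedily merge the endpoints of the heaviest communication
    edges until at most num_procs clusters remain.

    cost[t]      -- computation cost of task t (assumed > 0)
    comm[(a, b)] -- communication volume between tasks a and b (one entry per pair)
    """
    clusters = {t: {t} for t in cost}      # one singleton cluster per task
    owner = {t: t for t in cost}           # representative cluster of each task
    for (a, b), _ in sorted(comm.items(), key=lambda kv: -kv[1]):
        if len(clusters) <= num_procs:
            break
        ca, cb = owner[a], owner[b]
        if ca != cb:
            clusters[ca] |= clusters.pop(cb)
            for t in clusters[ca]:
                owner[t] = ca
    return list(clusters.values())


def map_tasks(cost, comm, num_procs):
    """Phase 2: place clusters largest-first on the least loaded processor,
    then move single tasks off the heaviest processor while that strictly
    reduces the maximum load."""
    procs = [set() for _ in range(num_procs)]
    load = [0.0] * num_procs
    for c in sorted(cluster_tasks(cost, comm, num_procs),
                    key=lambda c: -sum(cost[t] for t in c)):
        p = load.index(min(load))          # least loaded processor
        procs[p] |= c
        load[p] += sum(cost[t] for t in c)

    improved = True
    while improved:                        # simple task-reassignment pass
        improved = False
        hi = max(range(num_procs), key=lambda p: load[p])
        lo = min(range(num_procs), key=lambda p: load[p])
        for t in sorted(procs[hi], key=lambda t: cost[t], reverse=True):
            if cost[t] > 0 and load[lo] + cost[t] < load[hi]:
                procs[hi].remove(t)
                procs[lo].add(t)
                load[hi] -= cost[t]
                load[lo] += cost[t]
                improved = True
                break
    return procs


# Example: a 4-task ring mapped onto 2 processors.
cost = {0: 3.0, 1: 1.0, 2: 3.0, 3: 1.0}
comm = {(0, 1): 5.0, (1, 2): 1.0, (2, 3): 5.0, (3, 0): 1.0}
print(map_tasks(cost, comm, num_procs=2))  # e.g. [{0, 1}, {2, 3}]
```

In this toy run, clustering keeps the two heavily communicating pairs together and the placement step then balances the two equal-cost clusters across the processors, which is the kind of behaviour the two-phase approach is designed to achieve.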


Keywords: Mapping, task assignment, static load balancing, distributed-memory parallel machines, program graphs, heuristics



Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • M. A. Senar
  • A. Ripoll
  • A. Cortés
  • E. Luque

  Departament d'Informàtica, Unitat d'Arquitectura d'Ordinadors i Sistemes Operatius, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
