Dynamic Load Balancing in MPI Jobs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4759)

Abstract

There are at least three dimensions of overhead that any parallel job scheduling algorithm must consider: load balancing, synchronization, and communication overhead. In this work we first study several heuristics for choosing the next process to run from a global process queue. We then present a mechanism that decides at runtime whether to apply a local process queue per processor or a global process queue per job, depending on the degree of load balance of the job, without any prior knowledge of it.
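The runtime decision described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the imbalance metric (max over mean of per-process busy times), the threshold value, and all function names are assumptions made for the example.

```python
# Hedged sketch: choose between a per-processor local queue and a
# job-wide global queue at runtime, based on a measured degree of
# load imbalance. Metric, threshold, and names are illustrative.

from statistics import mean

def imbalance_degree(busy_times):
    """Imbalance = max busy time / mean busy time; 1.0 is perfectly balanced."""
    m = mean(busy_times)
    return max(busy_times) / m if m > 0 else 1.0

def choose_queue_policy(busy_times, threshold=1.25):
    """Well-balanced jobs keep cheap local queues (less synchronization);
    imbalanced jobs pay the cost of a global queue to rebalance."""
    return "local" if imbalance_degree(busy_times) <= threshold else "global"

# Balanced job: per-process busy times are nearly equal.
print(choose_queue_policy([9.8, 10.0, 10.1, 9.9]))   # -> local
# Imbalanced job: one process dominates the busy time.
print(choose_queue_policy([4.0, 5.0, 4.5, 12.0]))    # -> global
```

The trade-off encoded here mirrors the abstract: a local queue per processor minimizes synchronization overhead when the job is already balanced, while a global queue per job tolerates higher synchronization cost in exchange for better load distribution.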




Editor information

Jesús Labarta, Kazuki Joe, Toshinori Sato


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Utrera, G., Corbalán, J., Labarta, J. (2008). Dynamic Load Balancing in MPI Jobs. In: Labarta, J., Joe, K., Sato, T. (eds) High-Performance Computing. ISHPC ALPS 2005 2006. Lecture Notes in Computer Science, vol 4759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77704-5_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-77704-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77703-8

  • Online ISBN: 978-3-540-77704-5

  • eBook Packages: Computer Science (R0)
