International Journal of Parallel Programming, Volume 43, Issue 4, pp 597–630

A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations

  • Tae-Hyuk Ahn
  • Adrian Sandu
  • Layne T. Watson
  • Clifford A. Shaffer
  • Yang Cao
  • William T. Baumann


Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model are consistent with the theoretical analysis. The provable global decrease in load imbalance achieved by the local rebalancing algorithms is especially significant, given the scalability concerns for the global rebalancing algorithms. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%.
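The imbalance caused by assigning an equal number of tasks per processor can be illustrated with a small sketch. The code below is an illustration under stated assumptions, not the paper's framework and not any of its four strategies: the function names (`static_block_loads`, `greedy_loads`, `summarize`) are hypothetical, task run times are assumed to be known numbers, and the dynamic scheme shown is plain greedy self-scheduling, used here only as a simple stand-in for dynamic load balancing.

```python
import heapq


def static_block_loads(times, num_procs):
    """Give each processor an equal number of consecutive tasks.

    Assumption for this sketch: len(times) is divisible by num_procs.
    """
    if len(times) % num_procs != 0:
        raise ValueError("task count must be divisible by processor count")
    per_proc = len(times) // num_procs
    return [sum(times[p * per_proc:(p + 1) * per_proc])
            for p in range(num_procs)]


def greedy_loads(times, num_procs):
    """Hand each task, in order, to the processor that becomes free first.

    This is generic greedy self-scheduling, a stand-in for a dynamic
    scheme; it is not one of the strategies analyzed in the paper.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (time when free, processor)
    heapq.heapify(heap)
    for t in times:
        free_at, p = heapq.heappop(heap)
        heapq.heappush(heap, (free_at + t, p))
    loads = [0.0] * num_procs
    for free_at, p in heap:
        loads[p] = free_at
    return loads


def summarize(loads):
    """Return (makespan, total idle time, parallel efficiency)."""
    makespan = max(loads)
    total_work = sum(loads)
    idle = len(loads) * makespan - total_work
    efficiency = total_work / (len(loads) * makespan)
    return makespan, idle, efficiency


if __name__ == "__main__":
    # Hypothetical run times: six short tasks followed by two long ones.
    times = [1, 1, 1, 1, 1, 1, 9, 9]
    print("static:", summarize(static_block_loads(times, 2)))
    print("greedy:", summarize(greedy_loads(times, 2)))
```

For the hypothetical run times above on two processors, the static split yields loads of 4 and 20, so one processor sits idle for 16 time units and the parallel efficiency is 0.6, while the greedy scheme finishes both processors at time 12 with no idle time.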


Dynamic load balancing (DLB); Probabilistic framework analysis; Ensemble simulations; Stochastic simulation algorithm (SSA); High-performance computing (HPC); Budding yeast cell cycle



The authors thank the two anonymous reviewers whose comments helped improve this work. This work is supported in part by awards NIGMS/NIH 5 R01 GM078989, AFOSR FA9550-09-1-0153, NSF DMS-0540675, NSF CCF-0916493, NSF OCI-0904397, NSF DMS-1225160, and NSF CCF-0953590.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Tae-Hyuk Ahn (1)
  • Adrian Sandu (2)
  • Layne T. Watson (3)
  • Clifford A. Shaffer (2)
  • Yang Cao (2)
  • William T. Baumann (4)

  1. Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, USA
  2. Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, USA
  3. Departments of Computer Science, Mathematics, and Aerospace and Ocean Engineering, Virginia Polytechnic Institute and State University, Blacksburg, USA
  4. Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, USA
