Scheduling Efficiently for Irregular Load Distributions in a Large-scale Cluster

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3758)


Random stealing is a well-known dynamic scheduling algorithm. In a large-scale cluster, however, an idle node may have to make many random steal attempts before it obtains a task from another node. This problem severely degrades performance in systems where only a few nodes generate most of the workload. In this paper, we present Transitive Random Stealing (TRS), an efficient dynamic scheduling algorithm based on random stealing that lets any idle node rapidly obtain a task from another node under irregular load distributions in a large-scale cluster. Using the random baseline technique, we experimentally compare TRS with Shis, one of the load-balancing policies in the EARTH system, and with random stealing for different load distributions on the Tsinghua EastSun cluster, and show that TRS is a highly efficient scheduling algorithm for irregular load distributions in a large-scale cluster. Finally, TRS is implemented in the Jcluster environment, a high-performance Java parallel environment, and an experimental result is given for the HKU Gideon 300 cluster.
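
To make the cost of plain random stealing concrete, here is a minimal Python sketch, not taken from the paper, that counts how many uniform random probes an idle node needs before it hits a node holding work. It models only the baseline random stealing described above, not TRS itself, whose mechanics the abstract does not spell out. The function names, the node counts, and the assumption that a fixed set of nodes holds all the work are illustrative choices.

```python
import random


def steal_attempts(num_nodes, loaded, thief, rng):
    """Count uniform random probes until the idle thief hits a loaded node.

    Hypothetical model: every probe picks one of the other nodes uniformly
    at random, and a probe succeeds only if the victim is in `loaded`.
    """
    attempts = 0
    while True:
        victim = rng.randrange(num_nodes - 1)
        if victim >= thief:
            victim += 1  # shift the index so the thief never probes itself
        attempts += 1
        if victim in loaded:
            return attempts


def mean_attempts(num_nodes, num_loaded, trials=2000, seed=0):
    """Average probe count over many trials (requires num_loaded < num_nodes)."""
    rng = random.Random(seed)
    loaded = set(range(num_loaded))  # assumption: nodes 0..num_loaded-1 hold all work
    thief = num_nodes - 1            # an idle node outside the loaded set
    total = sum(
        steal_attempts(num_nodes, loaded, thief, rng) for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for k in (1, 2, 8, 64):
        print(f"256 nodes, {k} loaded: mean attempts = {mean_attempts(256, k):.1f}")
```

Under these assumptions each probe succeeds with probability k/(n-1) when k of n nodes hold work, so the mean number of attempts is about (n-1)/k. With 256 nodes and 2 loaded nodes the sketch should therefore report a mean near 127.5, which illustrates why skewed load distributions are costly for random stealing.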


Scheduling, irregular load distribution, large-scale cluster, transitive random stealing





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  1. Institute of Applied Physics and Computational Mathematics, Beijing, P.R. China
  2. Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China
