The sandpile scheduler

How self-organized criticality may lead to dynamic load-balancing

Abstract

This paper studies a self-organized criticality model, the sandpile, for dynamically load-balancing tasks that arrive as Bags-of-Tasks in large-scale decentralized systems. The sandpile is designed as a decentralized agent system that behaves as a cellular automaton working in a critical state at the edge of chaos. Depending on the state of the cellular automaton, assigning a new task to a resource may change nothing or may generate avalanches that reconfigure the state of the system. The frequency of such avalanches follows a power law in their size, a scale-invariant behavior that emerges without tuning or control parameters. In other words, large (catastrophic) avalanches are very rare, whereas small ones occur very often. Such an emergent pattern can be efficiently adapted for non-clairvoyant scheduling, where tasks are balanced across computing resources to maximize performance without assuming any knowledge of the task features. The algorithm design is validated experimentally, showing that the sandpile is able to find near-optimal schedules by reacting differently to different workload and architecture conditions.
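The avalanche behavior described in the abstract can be illustrated with the classic Bak-Tang-Wiesenfeld sandpile (reference 3). The sketch below is not the scheduler's own rule: the square grid, the toppling threshold of 4, the open boundaries, and the names `drop_grain` and `simulate` are all assumptions made for illustration. Each dropped grain stands in for a newly assigned task, and the avalanche size is the number of topplings it triggers.

```python
import random
from collections import Counter


def drop_grain(grid, row, col, threshold=4):
    """Add one grain at (row, col), relax the grid, and return the
    avalanche size, i.e. the number of topplings that followed."""
    n = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < threshold:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= threshold
        topplings += 1
        if grid[r][c] >= threshold:
            unstable.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            # Open boundary (assumption): grains pushed off the grid are lost.
            if 0 <= nr < n and 0 <= nc < n:
                grid[nr][nc] += 1
                if grid[nr][nc] >= threshold:
                    unstable.append((nr, nc))
    return topplings


def simulate(size=20, grains=20000, seed=0):
    """Drop grains at uniformly random sites and count avalanche sizes."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    sizes = Counter()
    for _ in range(grains):
        sizes[drop_grain(grid, rng.randrange(size), rng.randrange(size))] += 1
    return grid, sizes


if __name__ == "__main__":
    _, sizes = simulate()
    for avalanche_size, count in sorted(sizes.items())[:10]:
        print(avalanche_size, count)
```

Running the script prints how often each of the smallest avalanche sizes occurred. In this toy model, most drops change nothing or cause only a few topplings, while large avalanches are rare, which is the qualitative pattern the abstract refers to.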

Notes

  1. These two sites, β₁ and β₂, should be selected according to some criterion out of all neighbors of α. For the sake of simplicity, this study assumes that the neighbors are selected uniformly at random.

  2. The source code of the simulator is available at https://sandpile-scheduler.googlecode.com, published under the GPL v3 license.
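The uniform random selection mentioned in note 1 can be sketched as follows. This is a minimal illustration only: the function name `pick_two_neighbors` is hypothetical, and the requirement that the two sites be distinct is an assumption that the note does not state.

```python
import random


def pick_two_neighbors(neighbors, rng=random):
    """Return two neighbors of a site, chosen uniformly at random.

    Assumption: the two sites must be distinct, so sampling is done
    without replacement.
    """
    candidates = list(neighbors)
    if len(candidates) < 2:
        raise ValueError("a site needs at least two neighbors")
    beta1, beta2 = rng.sample(candidates, 2)
    return beta1, beta2
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible across simulation runs.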

References

  1. de Arcangelis, L., Herrmann, H.: Self-organized criticality on small world networks. Physica A: Stat. Mech. Appl. 308(1–4), 545–549 (2002). doi:10.1016/S0378-4371(02)00549-6

  2. Bak, P., Sneppen, K.: Punctuated equilibrium and criticality in a simple model of evolution. Phys. Rev. Lett. 71, 4083–4086 (1993). doi:10.1103/PhysRevLett.71.4083

  3. Bak, P., Tang, C., Wiesenfeld, K.: Self-organized criticality: an explanation of the 1/f noise. Phys. Rev. Lett. 59, 381–384 (1987). doi:10.1103/PhysRevLett.59.381

  4. Casanova, H., Gallet, M., Vivien, F.: Non-clairvoyant scheduling of multiple bag-of-tasks applications. In: Proceedings of the 16th International Euro-Par Conference on Parallel Processing: Part I, EuroPar’10, pp. 168–179. Springer, Berlin (2010). http://dl.acm.org/citation.cfm?id=1887695.1887715

  5. Chen, C.C., Chiao, L.Y., Lee, Y.T., Cheng, H.W., Wu, Y.M.: Long-range connective sandpile models and its implication to seismicity evolution. Tectonophysics 454(4), 104–107 (2008). doi:10.1016/j.tecto.2008.04.004

  6. Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic algorithms for replicated database maintenance. In: Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, PODC ’87, pp. 1–12. ACM, New York (1987). doi:10.1145/41840.41841

  7. Devine, K.D., Boman, E.G., Heaphy, R.T., Hendrickson, B.A., Teresco, J.D., Faik, J., Flaherty, J.E., Gervasio, L.G.: New challenges in dynamic load balancing. Appl. Numer. Math. 52(2–3), 133–152 (2005). doi:10.1016/j.apnum.2004.08.028

  8. Eugster, P.T., Guerraoui, R., Kermarrec, A.M., Massoulié, L.: Epidemic information dissemination in distributed systems. Computer 37(5), 60–67 (2004). doi:10.1109/MC.2004.1297243

  9. Franceschelli, M., Giua, A., Seatzu, C.: Load balancing on networks with gossip-based distributed algorithms. In: 46th IEEE Conference on Decision and Control 2007, pp. 500–505 (2007). doi:10.1109/CDC.2007.4434904

  10. Franceschelli, M., Giua, A., Seatzu, C.: Load balancing over heterogeneous networks with gossip-based algorithms. In: American Control Conference, ACC ’09, pp. 1987–1993 (2009). doi:10.1109/ACC.2009.5160452

  11. Garey, M.R., Johnson, D.S.: Computers and Intractability; A Guide to the Theory of NP-Completeness. Freeman, New York (1990)

  12. Guinand, F., Semaan, F.: High performance computing environments and sand piles. In: International Conference on Metaheuristics and Nature Inspired Computing (META 2012), Port El Kantaoui, Tunisia (2012)

  13. Hu, J., Klefstad, R.: Decentralized load balancing on unstructured peer-2-peer computing grids. In: Fifth IEEE International Symposium on Network Computing and Applications, NCA 2006, pp. 247–250 (2006). doi:10.1109/NCA.2006.21

  14. Iosup, A., Sonmez, O., Anoep, S., Epema, D.: The performance of bags-of-tasks in large-scale distributed systems. In: Proceedings of the 17th International Symposium on High Performance Distributed Computing, HPDC ’08, pp. 97–108. ACM, New York (2008). doi:10.1145/1383422.1383435

  15. Jelasity, M., Guerraoui, R., Kermarrec, A.M., van Steen, M.: The peer sampling service: experimental evaluation of unstructured gossip-based implementations. In: Proceedings of the 5th ACM/IFIP/USENIX International Conference on Middleware, Middleware ’04, pp. 79–98. Springer, New York (2004). http://dl.acm.org/citation.cfm?id=1045658.1045666

  16. Jelasity, M., Montresor, A., Babaoglu, O.: A modular paradigm for building self-organizing peer-to-peer applications. In: Di Marzo Serugendo, G. (ed.) Engineering Self-organising Systems, pp. 265–282. Springer, Berlin (2003)

  17. Laredo, J., Dorronsoro, B., Pecero, J., Bouvry, P., Durillo, J., Fernandes, C.: Designing a self-organized approach for scheduling bag-of-tasks. In: Seventh International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC) 2012, pp. 315–320 (2012). doi:10.1109/3PGCIC.2012.28

  18. Semaan, F.: Répartition dynamique de charge et phénomènes d’avalanche (2006)

  19. Subramaniyan, R., Raman, P., George, A., Radlinski, M.: Gems: gossip-enabled monitoring service for scalable heterogeneous distributed systems. Clust. Comput. 9, 101–120 (2006). doi:10.1007/s10586-006-4900-5

  20. Watts, D., Strogatz, S.: Collective dynamics of “small-world” networks. Nature 393, 440–442 (1998). doi:10.1038/30918

  21. Willebeek-LeMair, M., Reeves, A.: Strategies for dynamic load balancing on highly parallel computers. IEEE Trans. Parallel Distrib. Syst. 4(9), 979–993 (1993). doi:10.1109/71.243526

Acknowledgements

This work was supported by the Luxembourg FNR Green@Cloud project (INTER/CNRS/11/03). C. Fernandes wishes to thank FCT, Portuguese Ministry of Science, for his Research Fellowship SFRH/BPD/66876/2009. B. Dorronsoro acknowledges the support of the Fonds National de la Recherche, Luxembourg (AFR contract no. 4017742).

Author information

Corresponding author

Correspondence to J. L. J. Laredo.

About this article

Cite this article

Laredo, J.L.J., Bouvry, P., Guinand, F. et al. The sandpile scheduler. Cluster Comput 17, 191–204 (2014). https://doi.org/10.1007/s10586-013-0328-x

  • DOI: https://doi.org/10.1007/s10586-013-0328-x

Keywords

  • Optimization
  • Self-organization
  • Scheduling
  • Distributed systems