Cluster Computing, Volume 16, Issue 2, pp 299–319

Juggle: addressing extrinsic load imbalances in SPMD applications on multicore computers


  • Steven Hofmeyr
    • Lawrence Berkeley National Laboratory
  • Juan A. Colmenares
    • Parallel Computing Laboratory, UC Berkeley
  • Costin Iancu
    • Lawrence Berkeley National Laboratory
  • John Kubiatowicz
    • Parallel Computing Laboratory, UC Berkeley

DOI: 10.1007/s10586-012-0204-0

Cite this article as:
Hofmeyr, S., Colmenares, J.A., Iancu, C. et al. Cluster Comput (2013) 16: 299. doi:10.1007/s10586-012-0204-0


We investigate proactive dynamic load balancing on multicore systems, in which threads are continually migrated to reduce the impact of processor/thread mismatches. Our goal is to enhance the flexibility of the SPMD-style programming model and enable SPMD applications to run efficiently in multiprogrammed environments. We present Juggle, a practical decentralized, user-space implementation of a proactive load balancer that emphasizes portability and usability. In this paper we assume perfect intrinsic load balance and focus on extrinsic imbalances caused by OS noise, multiprogramming and mismatches of threads to hardware parallelism. Juggle shows performance improvements of up to 80 % over static load balancing for oversubscribed UPC, OpenMP, and pthreads benchmarks. We also show that Juggle is effective in unpredictable, multiprogrammed environments, with up to a 50 % performance improvement over the Linux load balancer and a 25 % reduction in performance variation. We analyze the impact of Juggle on parallel applications and derive lower bounds and approximations for thread completion times. We show that results from Juggle closely match theoretical predictions across a variety of architectures, including NUMA and hyper-threaded systems.
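The mismatch of threads to hardware parallelism mentioned in the abstract can be illustrated with a small back-of-the-envelope model. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes identical cores, equal work per thread, fair time-sharing on each core, and zero migration cost. It compares threads pinned to cores for the whole run against an idealized balancer that spreads total work evenly across all cores.

```python
import math


def static_completion_time(num_threads, num_cores, work_per_thread=1.0):
    """Time until the last thread finishes when threads never migrate.

    Assumption (not from the paper): threads are distributed as evenly as
    possible, so the busiest core holds ceil(num_threads / num_cores)
    threads and the whole SPMD step waits for that core.
    """
    threads_on_busiest_core = math.ceil(num_threads / num_cores)
    return threads_on_busiest_core * work_per_thread


def ideal_balanced_completion_time(num_threads, num_cores, work_per_thread=1.0):
    """Idealized time when total work is spread perfectly across cores.

    A single thread cannot run on more than one core at once, so the time
    can never drop below the work of one thread.
    """
    evenly_spread = num_threads * work_per_thread / num_cores
    return max(work_per_thread, evenly_spread)


if __name__ == "__main__":
    # Hypothetical oversubscribed case: 8 threads on 7 cores.
    static_time = static_completion_time(8, 7)
    ideal_time = ideal_balanced_completion_time(8, 7)
    print(f"static: {static_time:.3f}, idealized balanced: {ideal_time:.3f}")
```

Under these assumptions, one extra thread beyond the core count doubles the static completion time, while the idealized balanced time grows only in proportion to the extra work. A real balancer pays migration and cache costs, so it would land somewhere between the two figures.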


Keywords: Proactive load balancing; Parallel programming; Single-program multiple-data parallelism; Operating system; Multicore

Copyright information

© Springer Science + Business Media, LLC (outside the USA) 2012