Performance Evaluation of Work Stealing for Streaming Applications

  • Jonatha Anselmi
  • Bruno Gaujal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5923)


This paper studies the performance of parallel stream computations on a multiprocessor architecture using a work-stealing strategy. Incoming tasks are split into a number of jobs allocated to the processors, and whenever a processor becomes idle, it steals a fraction (typically half) of the jobs from a busy processor. We propose a new model for the performance analysis of such parallel stream computations. This model takes into account both the algorithmic behavior of work stealing and the hardware constraints of the architecture (synchronizations and bus contentions). We then show that this model can be solved using a recursive formula, and that this recursive analytical approach is more efficient than the classic global balance technique. However, our method remains computationally impractical when tasks are split into many jobs or when many processors are considered. Therefore, bounds are proposed to efficiently solve very large models in an approximate manner. Experimental results show that these bounds are tight and robust, so they find immediate application in optimization studies. An example is provided for the optimization of energy consumption under performance constraints. In addition, our framework is flexible, and we show how it adapts to several stealing strategies.
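The steal-half rule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model: it assumes unit-time jobs, free and instantaneous steals (the paper's synchronization and bus-contention costs are ignored), and a victim chosen as the most loaded processor, whereas the abstract only says "a busy processor". The names `steal_half` and `makespan` are hypothetical.

```python
from typing import List


def steal_half(queues: List[int], idle: int) -> int:
    """Move half of the most loaded processor's jobs to processor `idle`.

    Assumption: the victim is the most loaded processor (ties go to the
    lowest index). Returns the victim's index, or -1 if nothing is stolen.
    """
    victim = max(range(len(queues)), key=lambda p: queues[p])
    if victim == idle or queues[victim] < 2:
        return -1  # nobody has enough jobs to share
    stolen = queues[victim] // 2
    queues[victim] -= stolen
    queues[idle] += stolen
    return victim


def makespan(jobs: int, processors: int) -> int:
    """Unit-time steps needed to finish one task split into `jobs` unit jobs.

    Assumption: all jobs start on processor 0 and steals cost nothing.
    """
    queues = [jobs] + [0] * (processors - 1)
    steps = 0
    while sum(queues) > 0:
        # Every idle processor tries to steal before the next time step.
        for p in range(processors):
            if queues[p] == 0:
                steal_half(queues, p)
        # Each processor with work completes one job in this step.
        queues = [max(q - 1, 0) for q in queues]
        steps += 1
    return steps
```

Under these assumptions, `makespan(8, 1)` returns 8 and `makespan(8, 2)` returns 4. Charging a cost for each steal would be the natural place to account for the hardware constraints that the paper's model includes.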


Keywords: Work Stealing, Performance Evaluation, Markov Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jonatha Anselmi (1)
  • Bruno Gaujal (1)
  1. INRIA and LIG Laboratory, Montbonnot Saint-Martin, France
