Performance Evaluation of Work Stealing for Streaming Applications
This paper studies the performance of parallel stream computations on a multiprocessor architecture using a work-stealing strategy. Incoming tasks are split into a number of jobs that are allocated to the processors; whenever a processor becomes idle, it steals a fraction (typically half) of the jobs from a busy processor. We propose a new model for the performance analysis of such parallel stream computations. This model takes into account both the algorithmic behavior of work stealing and the hardware constraints of the architecture (synchronizations and bus contention). We then show that this model can be solved using a recursive formula, and that this recursive analytical approach is more efficient than the classic global balance technique. However, our method remains computationally impractical when tasks are split into many jobs or when many processors are considered. We therefore propose bounds that solve very large models efficiently in an approximate manner. Experimental results show that these bounds are tight and robust, so they find immediate application in optimization studies. An example is provided for the optimization of energy consumption under performance constraints. In addition, our framework is flexible, and we show how it adapts to several stealing strategies.
Keywords: Work Stealing, Performance Evaluation, Markov Model
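The steal-half rule described in the abstract can be illustrated with a small deterministic simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: jobs take one time step each, a task starts entirely on processor 0, the victim is always the most loaded processor, and steals are free and instantaneous (the paper's model, by contrast, accounts for synchronization and bus contention). The names `steal_half` and `run_task` are hypothetical.

```python
from typing import List


def steal_half(queues: List[int], thief: int) -> int:
    """Move half (rounded down) of the most loaded processor's jobs to `thief`.

    Assumption: the victim is the processor with the most jobs.
    Returns the number of jobs stolen.
    """
    victim = max(range(len(queues)), key=lambda p: queues[p])
    stolen = queues[victim] // 2
    queues[victim] -= stolen
    queues[thief] += stolen
    return stolen


def run_task(num_jobs: int, num_procs: int) -> int:
    """Run one task of unit-time jobs and return the number of time steps used."""
    queues = [0] * num_procs
    queues[0] = num_jobs  # assumption: the whole task starts on processor 0
    steps = 0
    while sum(queues) > 0:
        # Idle processors steal first (assumed to cost no time).
        for p in range(num_procs):
            if queues[p] == 0:
                steal_half(queues, p)
        # Every processor that holds work executes one job.
        for p in range(num_procs):
            if queues[p] > 0:
                queues[p] -= 1
        steps += 1
    return steps


if __name__ == "__main__":
    for procs in (1, 2, 4):
        print(procs, "processors:", run_task(8, procs), "steps")
```

Because steals are free in this sketch, it shows only the load-balancing effect of the strategy; it says nothing about the overheads that the analytical model is designed to capture.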