Discrete Event Dynamic Systems, Volume 21, Issue 1, pp 63–101

A mean field approach for optimization in discrete time

  • Nicolas Gast
  • Bruno Gaujal


This paper investigates the limit behavior of Markov decision processes made of independent objects evolving in a common environment, when the number of objects \(N\) goes to infinity. In the finite horizon case, we show that when the number of objects becomes large, the optimal cost of the system converges to the optimal cost of a deterministic discrete time system. Convergence also holds for optimal policies. We further provide bounds on the speed of convergence by proving second order results that resemble central limit theorems for the cost and the state of the Markov decision process, with explicit formulas for the limit. These bounds (of order \(1/\sqrt{N}\)) are shown to be tight in a numerical example. One can go further and obtain convergence of order \(\sqrt{\log N}/N\) to a stochastic system made of the mean field limit and a Gaussian term. Our framework is applied to a brokering problem in grid computing. Several simulations with growing numbers of processors are reported. They compare the performance of the optimal policy of the limit system, used in the finite case, with classical policies by measuring its asymptotic gain. Several extensions are also discussed. In particular, for infinite horizon cases with discounted costs, we show that first order limits hold and that second order results also hold as long as the discount factor is small enough. For infinite horizon cases with non-discounted costs, examples show that even the first order limits may not hold.
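The \(1/\sqrt{N}\) scaling mentioned in the abstract can be illustrated with a small simulation. The sketch below is only a toy, not the model studied in the paper: it assumes N independent two-state objects driven by a fixed, hypothetical transition matrix `P`, with no control and no common environment. It compares the simulated fraction of objects in state 1 with the deterministic recursion \(m_{t+1} = m_t P\). All names (`P`, `mean_field`, `simulate`, `rms_gap`) are illustrative assumptions.

```python
import math
import random

# Hypothetical two-state transition matrix (an assumption for illustration):
# P[i][j] is the probability that an object moves from state i to state j.
P = [[0.9, 0.1],
     [0.4, 0.6]]


def mean_field(horizon):
    """Deterministic limit m_{t+1} = m_t P; returns the fraction in state 1 at each step."""
    m = [1.0, 0.0]  # all mass starts in state 0
    out = [m[1]]
    for _ in range(horizon):
        m = [sum(m[i] * P[i][j] for i in range(2)) for j in range(2)]
        out.append(m[1])
    return out


def simulate(n_objects, horizon, rng):
    """Simulate n_objects independent objects; returns the fraction in state 1 at each step."""
    states = [0] * n_objects  # every object starts in state 0
    fractions = [0.0]
    for _ in range(horizon):
        states = [1 if rng.random() < P[s][1] else 0 for s in states]
        fractions.append(sum(states) / n_objects)
    return fractions


def rms_gap(n_objects, horizon=10, runs=200, seed=0):
    """Root-mean-square gap at the final step between the N-object system and its limit."""
    rng = random.Random(seed)
    limit = mean_field(horizon)[-1]
    total = 0.0
    for _ in range(runs):
        total += (simulate(n_objects, horizon, rng)[-1] - limit) ** 2
    return math.sqrt(total / runs)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        gap = rms_gap(n)
        # If the gap scales like 1/sqrt(N), the last column should stay roughly constant.
        print(n, round(gap, 4), round(gap * math.sqrt(n), 3))
```

In this toy setting the rescaled gap `gap * sqrt(n)` should stay roughly constant as `n` grows, which is the behavior a \(1/\sqrt{N}\) rate predicts. The controlled setting of the paper requires optimizing over policies and is not captured here.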


Keywords: Mean field; Markov decision process; Brokering



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Grenoble Université, Grenoble, France
  2. LIG, Montbonnot, France
  3. INRIA, Montbonnot, France
