Rollout Algorithms for Discrete Optimization: A Survey

Reference work entry


This chapter discusses rollout algorithms, a sequential approach to optimization in which the optimization variables are chosen one after the other. A rollout algorithm starts from a given heuristic, called the base heuristic, and uses it to construct another heuristic whose performance is, under suitable conditions on the base heuristic, no worse than that of the original and is typically better. The method is simple to implement and is often surprisingly effective. This chapter explains the method and its properties for discrete deterministic optimization problems.
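As a rough illustration of the idea described above, the following Python sketch applies rollout to a small traveling salesman instance. It is not taken from the chapter: the choice of problem, the nearest-neighbor rule used as the starting heuristic, the function names (`tour_cost`, `base_heuristic`, `rollout`), and the distance matrix `DIST` are all assumptions made for illustration. At each step the sketch fixes one more city, choosing the candidate whose completion by the starting heuristic gives the cheapest full tour.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def tour_cost(dist: Matrix, tour: Sequence[int]) -> float:
    """Cost of the closed tour that returns to its first city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def base_heuristic(dist: Matrix, partial: Sequence[int]) -> List[int]:
    """Nearest-neighbor completion of a partial tour (ties broken by index)."""
    tour = list(partial)
    remaining = set(range(len(dist))) - set(tour)
    while remaining:
        last = tour[-1]
        nxt = min(remaining, key=lambda j: (dist[last][j], j))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour


def rollout(dist: Matrix, start: int = 0) -> List[int]:
    """Fix one city at a time, scoring each candidate with the base heuristic."""
    tour = [start]
    while len(tour) < len(dist):
        candidates = [j for j in range(len(dist)) if j not in tour]
        # Score candidate j by the cost of the full tour that the base
        # heuristic produces from the partial tour extended with j.
        best = min(
            candidates,
            key=lambda j: (tour_cost(dist, base_heuristic(dist, tour + [j])), j),
        )
        tour.append(best)
    return tour


# Hypothetical 5-city instance with symmetric distances (illustration only).
DIST = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]

base_tour = base_heuristic(DIST, [0])
rollout_tour = rollout(DIST)
print("base:   ", base_tour, tour_cost(DIST, base_tour))
print("rollout:", rollout_tour, tour_cost(DIST, rollout_tour))
```

Because the nearest-neighbor rule here breaks ties deterministically, completing a partial tour that already contains its own next choice yields the same full tour. Under that assumption the cost of the rollout tour cannot exceed the cost of the tour produced by the starting heuristic alone, at the price of roughly one heuristic run per candidate per step.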


Keywords (machine-generated, not supplied by the authors): Destination Node; Model Predictive Control; Local Search Method; Policy Iteration; Origin Node


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, USA
