Rollout Algorithms for Discrete Optimization: A Survey

  • Reference work entry
Handbook of Combinatorial Optimization

Abstract

This chapter discusses rollout algorithms, a sequential approach to optimization problems, whereby the optimization variables are optimized one after the other. A rollout algorithm starts from some given heuristic and constructs another heuristic with better performance than the original. The method is particularly simple to implement and is often surprisingly effective. This chapter explains the method and its properties for discrete deterministic optimization problems.
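
As a minimal sketch of the idea in the abstract, the Python code below builds a rollout algorithm on top of a base heuristic. The setting is an assumption chosen for illustration only and is not taken from the chapter: a small traveling-salesman instance with integer coordinates and Manhattan distances, with nearest-neighbor completion as the base heuristic. The names `dist`, `tour_cost`, `base_heuristic`, and `rollout` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def dist(a: Point, b: Point) -> int:
    """Manhattan distance (integers, so cost comparisons are exact)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def tour_cost(points: Sequence[Point], tour: Sequence[int]) -> int:
    """Length of the closed tour that visits the cities in the given order."""
    n = len(tour)
    return sum(dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n))


def base_heuristic(points: Sequence[Point], partial: Sequence[int]) -> List[int]:
    """Assumed base heuristic: complete a partial tour by repeatedly moving
    to the nearest unvisited city (ties go to the lowest city index)."""
    tour = list(partial)
    unvisited = [c for c in range(len(points)) if c not in tour]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda c: (dist(last, points[c]), c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def rollout(points: Sequence[Point], start: int = 0) -> List[int]:
    """Fix one variable (the next city) at a time: try every candidate,
    let the base heuristic complete the tour, and keep the best candidate."""
    tour = [start]
    while len(tour) < len(points):
        candidates = [c for c in range(len(points)) if c not in tour]
        best = min(
            candidates,
            key=lambda c: (tour_cost(points, base_heuristic(points, tour + [c])), c),
        )
        tour.append(best)
    return tour


if __name__ == "__main__":
    cities = [(0, 0), (2, 1), (5, 0), (1, 4), (6, 3), (3, 3)]
    base_tour = base_heuristic(cities, [0])
    rollout_tour = rollout(cities, 0)
    print("base   :", base_tour, tour_cost(cities, base_tour))
    print("rollout:", rollout_tour, tour_cost(cities, rollout_tour))
```

In this sketch the rollout tour cannot be longer than the base tour: at every step the base heuristic's own next city is among the candidates evaluated, and the heuristic's completion depends only on the current city and the set of unvisited cities. The price is one run of the base heuristic per candidate per step.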

Notes

  1. In the case where there are multiple arcs connecting a node pair, all these arcs can be merged into a single arc, since the set of destination nodes that can be reached from any non-destination node will not be affected.

  2. For an example where this convention for tie-breaking is not observed and, as a consequence, \(\mathcal{R}\mathcal{H}\) does not terminate, assume that there is a single destination d and that all other nodes are arranged in a cycle. Each non-destination node i has two outgoing arcs: one arc that belongs to the cycle and another arc, (i, d). Let \(\mathcal{H}\) be the (sequentially consistent) base heuristic that, starting from a node i ≠ d, generates the path (i, d). When the terminal node of the path is node i, the rollout algorithm \(\mathcal{R}\mathcal{H}\) compares the two neighbors of i, which are d and the node next to i on the cycle, call it j. Both neighbors have d as their projection, so there is a tie in Eq. (6). It can be seen that if \(\mathcal{R}\mathcal{H}\) breaks ties in favor of the neighbor j that lies on the cycle, then \(\mathcal{R}\mathcal{H}\) continually repeats the cycle and never terminates.

  3. It is assumed here that there are no termination/cycling difficulties of the type illustrated in the footnote following Example 3.
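
The cycling example in the second note can be simulated with a short Python sketch. Everything here is an illustrative assumption rather than the chapter's own formulation: the cycle nodes are numbered 0 to n-1, the destination is the string "d", the cost of a projection is a constant (so both neighbors always tie), and the names `projection`, `rollout_path`, and the `max_steps` cap are hypothetical. The cap exists only so that the non-terminating case can be observed.

```python
from typing import List, Tuple, Union

Node = Union[int, str]
DEST: Node = "d"  # the single destination node


def projection(node: Node) -> Node:
    """Terminal node of the base heuristic's path from `node`.
    The heuristic generates (i, d) from any i != d, so it always ends at d."""
    return DEST


def cost(terminal: Node) -> int:
    """Assumed cost of a projection; only d is ever reached, so it is constant."""
    return 0


def rollout_path(n: int, start: int, prefer_cycle: bool,
                 max_steps: int = 20) -> Tuple[List[Node], bool]:
    """Run the rollout step from `start` on a cycle of n nodes.
    Returns the path and whether the destination was reached within the cap."""
    path: List[Node] = [start]
    while path[-1] != DEST and len(path) <= max_steps:
        i = path[-1]
        j = (i + 1) % n                      # next node on the cycle
        neighbors = [DEST, j]
        costs = {nb: cost(projection(nb)) for nb in neighbors}
        best = min(costs.values())
        tied = [nb for nb in neighbors if costs[nb] == best]
        if len(tied) > 1:
            # Both neighbors project to d: the tie-breaking rule decides.
            choice = j if prefer_cycle else DEST
        else:
            choice = tied[0]
        path.append(choice)
    return path, path[-1] == DEST


if __name__ == "__main__":
    print(rollout_path(4, 0, prefer_cycle=False))
    print(rollout_path(4, 0, prefer_cycle=True, max_steps=8))
```

With ties broken in favor of d the path ends after one move; with ties broken in favor of the cycle neighbor j the path keeps circling until the step cap stops it.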

Author information

Corresponding author

Correspondence to Dimitri P. Bertsekas.

Copyright information

© 2013 Springer Science+Business Media New York

About this entry

Cite this entry

Bertsekas, D.P. (2013). Rollout Algorithms for Discrete Optimization: A Survey. In: Pardalos, P., Du, DZ., Graham, R. (eds) Handbook of Combinatorial Optimization. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7997-1_8
