Rollout Algorithms for Discrete Optimization: A Survey

Bertsekas, Dimitri P.

doi:10.1007/978-1-4419-7997-1_8

Abstract

This chapter discusses rollout algorithms, a sequential approach to optimization problems, whereby the optimization variables are optimized one after the other. A rollout algorithm starts from some given heuristic and constructs another heuristic with better performance than the original. The method is particularly simple to implement and is often surprisingly effective. This chapter explains the method and its properties for discrete deterministic optimization problems.
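
As a minimal sketch of the idea described above (not the chapter's own code), the following applies rollout to a small traveling-salesman instance. The distance matrix `DIST`, the nearest-neighbor base heuristic, and all function names are assumptions chosen for illustration. At each step the rollout algorithm tries every candidate for the next variable (here, the next city), completes the tour with the base heuristic, and keeps the candidate whose completed tour is cheapest.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]

# Hypothetical symmetric distance matrix for five cities (illustration only).
DIST: Matrix = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]


def tour_cost(dist: Matrix, tour: List[int]) -> int:
    """Cost of a closed tour that returns to its first city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def base_heuristic(dist: Matrix, partial: List[int]) -> List[int]:
    """Complete a partial tour by nearest neighbor (ties go to the lowest index)."""
    tour = list(partial)
    unvisited = [c for c in range(len(dist)) if c not in tour]
    while unvisited:
        nxt = min(unvisited, key=lambda c: (dist[tour[-1]][c], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def rollout(dist: Matrix, start: int = 0) -> List[int]:
    """Fix one city at a time, scoring each candidate by the base heuristic's completion."""
    tour = [start]
    while len(tour) < len(dist):
        candidates = [c for c in range(len(dist)) if c not in tour]
        best = min(
            candidates,
            key=lambda c: (tour_cost(dist, base_heuristic(dist, tour + [c])), c),
        )
        tour.append(best)
    return tour


base_tour = base_heuristic(DIST, [0])
rollout_tour = rollout(DIST, 0)
print("base heuristic:", base_tour, tour_cost(DIST, base_tour))
print("rollout:       ", rollout_tour, tour_cost(DIST, rollout_tour))
```

Because this base heuristic picks its next city from the current partial tour alone, with deterministic tie-breaking, the rollout tour in this sketch costs no more than the base tour. The price is extra computation: each rollout step runs the base heuristic once per remaining candidate.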

Notes

1. In the case where there are multiple arcs connecting a node pair, all these arcs can be merged into a single arc, since the set of destination nodes that can be reached from any non-destination node will not be affected.
2. For an example where this convention for tie-breaking is not observed and, as a consequence, \(\mathcal{R}\mathcal{H}\) does not terminate, assume that there is a single destination d and that all other nodes are arranged in a cycle. Each non-destination node i has two outgoing arcs: one arc that belongs to the cycle and another arc, which is (i, d). Let \(\mathcal{H}\) be the (sequentially consistent) base heuristic that, starting from a node i ≠ d, generates the path (i, d). When the terminal node of the path is node i, the rollout algorithm \(\mathcal{R}\mathcal{H}\) compares the two neighbors of i, which are d and the node next to i on the cycle, call it j. Both neighbors have d as their projection, so there is a tie in Eq. (6). It can be seen that if \(\mathcal{R}\mathcal{H}\) breaks ties in favor of the neighbor j that lies on the cycle, then \(\mathcal{R}\mathcal{H}\) continually repeats the cycle and never terminates.
3. It is assumed here that there are no termination/cycling difficulties of the type illustrated in the footnote following Example 3.
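
The cycling example in the second note can be simulated with a short sketch. The four-node cycle, the function names, and the `max_moves` cap are assumptions added for illustration. The tie-breaking convention itself is not stated on this page, so the sketch assumes it means preferring the node that the base heuristic would move to. The comparison the note attributes to Eq. (6) is modeled simply as comparing the cost of each neighbor's projection.

```python
from typing import Dict, List, Union

Node = Union[int, str]

D: Node = "d"                     # the single destination
CYCLE: List[int] = [0, 1, 2, 3]   # hypothetical non-destination nodes on a cycle
COST: Dict[Node, int] = {D: 0}    # cost of each destination (only one here)


def neighbors(i: Node) -> List[Node]:
    """Each cycle node has an arc to the next cycle node and an arc (i, d)."""
    return [CYCLE[(CYCLE.index(i) + 1) % len(CYCLE)], D]


def heuristic_next(i: Node) -> Node:
    """Base heuristic: from any i != d, generate the path (i, d)."""
    return D


def projection(node: Node) -> Node:
    """Destination the base heuristic reaches from `node`; always d in this graph."""
    return D


def rollout_path(start: int, prefer_cycle: bool, max_moves: int = 12) -> List[Node]:
    """Run the rollout step repeatedly, stopping at d or after max_moves moves."""
    path: List[Node] = [start]
    while path[-1] != D and len(path) - 1 < max_moves:
        i = path[-1]
        nbrs = neighbors(i)
        best = min(COST[projection(n)] for n in nbrs)
        tied = [n for n in nbrs if COST[projection(n)] == best]
        if prefer_cycle:
            # Tie broken toward the cycle neighbor j: the case the note warns about.
            choice = next(n for n in tied if n != D)
        else:
            # Assumed convention: prefer the base heuristic's own next node.
            h = heuristic_next(i)
            choice = h if h in tied else tied[0]
        path.append(choice)
    return path


print(rollout_path(0, prefer_cycle=False))  # reaches d
print(rollout_path(0, prefer_cycle=True))   # circles until the move cap
```

The cap on moves exists only so that the demonstration halts. Without it, the `prefer_cycle=True` run would loop indefinitely, which is the non-termination the note describes.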

Author information

Authors and Affiliations

Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Room 32-660D, 02139, Cambridge, MA, USA
Dimitri P. Bertsekas

Corresponding author

Correspondence to Dimitri P. Bertsekas.

Editor information

Editors and Affiliations

Department of Industrial and Systems Eng, University of Florida, Gainesville, Florida, USA
Panos M. Pardalos
Department of Computer Science, University of Texas, Dallas, Richardson, Texas, USA
Ding-Zhu Du
Dept. Comp. Sci. & Engineering, University of California, San Diego, La Jolla, California, USA
Ronald L. Graham

Cite this entry

Bertsekas, D.P. (2013). Rollout Algorithms for Discrete Optimization: A Survey. In: Pardalos, P., Du, DZ., Graham, R. (eds) Handbook of Combinatorial Optimization. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7997-1_8

DOI: https://doi.org/10.1007/978-1-4419-7997-1_8
Published: 26 July 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7996-4
Online ISBN: 978-1-4419-7997-1
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering
