Approximate Dynamic Programming: Linear Programming-Based Approaches

Living reference work entry in the Encyclopedia of Optimization

Introduction

The theory of Markov decision processes (MDPs) provides a mathematical framework for modeling sequential decision problems under uncertainty. Most real-life sequential decision problems can be adequately described by an MDP. The formulation, however, often leads to a very large number of states, making exact solution methods infeasible in many cases. To overcome this issue, a variety of approximation methods have been developed; they are collectively referred to as approximate dynamic programming (ADP). These methods can broadly be categorized as simulation-based or linear programming (LP)-based. For simulation-based approaches, we refer to the books of Powell [40] and Bertsekas [10, 11]. In this chapter, we summarize the main ideas of LP-based approximate dynamic programming. This literature is sometimes also referred to as approximate linear programming (ALP), a term coined in [16].
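
The basic construction behind ALP can be sketched as follows; this is the standard discounted-reward formulation in the spirit of [32, 18, 44, 16], and the notation \(c\), \(r\), \(p\), \(\gamma\), \(\phi_k\) is chosen here purely for illustration, so it need not match the models section below. The exact LP, whose unique optimal solution is the value function \(V^*\) for any state-relevance weights \(c>0\) (see, e.g., [41]), reads

\[
\begin{aligned}
\min_{V}\;\; & \sum_{\mathbf x\in\mathcal X} c(\mathbf x)\,V(\mathbf x) \\
\text{s.t.}\;\; & V(\mathbf x) \;\ge\; r(\mathbf x,a) + \gamma \sum_{\mathbf x'\in\mathcal X} p(\mathbf x'\mid\mathbf x,a)\,V(\mathbf x') \qquad \text{for all } \mathbf x\in\mathcal X,\; a\in\mathcal A(\mathbf x),
\end{aligned}
\]

where \(r\) denotes the one-stage reward, \(p\) the transition probabilities, and \(\gamma\in(0,1)\) the discount factor. The ALP of [44, 16] restricts \(V\) to the span of a small set of basis functions, \(V(\mathbf x)\approx\sum_{k=1}^{K} w_k\,\phi_k(\mathbf x)\), so that only the weights \(w_1,\dots,w_K\) remain as decision variables. The resulting LP still has one constraint per state-action pair; constraint sampling [17], constraint generation, and problem-specific reformulations (e.g., [45, 52]) are used to cope with this.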

Models

Consider a Markov process \((X_t)_{t\in \mathbb {N}_0}\) that can be in any state \(\mathbf x\in \mathcal X\) with \(\mathc...
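
Since only the opening of the model description is reproduced above, the following self-contained sketch is our own illustration rather than the entry's formulation: it builds a small machine-replacement MDP and solves the associated ALP with SciPy's linprog, using a constant, linear, and quadratic basis in the wear level. All names and numbers (N, gamma, the reward and transition structure, the basis choice) are assumptions made purely for illustration.

import numpy as np
from scipy.optimize import linprog

# Toy machine-replacement MDP (our own illustrative example, not from the entry):
# states 0..N-1 encode the wear level, action 0 = operate, action 1 = replace.
N, gamma = 20, 0.95
states = np.arange(N)

# Rewards r[a, x] and transition matrices P[a, x, x'].
r = np.zeros((2, N))
r[0] = 10.0 - 0.5 * states              # operating reward decays with wear
r[1] = -5.0                             # fixed replacement cost
P = np.zeros((2, N, N))
for x in range(N):
    P[0, x, min(x + 1, N - 1)] += 0.7   # wear worsens ...
    P[0, x, x] += 0.3                   # ... or stays the same
    P[1, x, 0] = 1.0                    # replacement resets the wear level
c = np.full(N, 1.0 / N)                 # state-relevance weights

# Basis functions: constant, linear, and quadratic in the wear level.
Phi = np.column_stack([np.ones(N), states, states ** 2]).astype(float)

# ALP: min_w  c' Phi w   s.t.   Phi w >= r_a + gamma * P_a Phi w   for each action a.
# scipy.optimize.linprog expects A_ub w <= b_ub, so the constraints are negated.
A_ub = np.vstack([-(Phi - gamma * P[a] @ Phi) for a in range(2)])
b_ub = np.concatenate([-r[a] for a in range(2)])
res = linprog(c @ Phi, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * Phi.shape[1])

w = res.x
V_approx = Phi @ w                      # pointwise upper bound on the true value function
Q = np.array([r[a] + gamma * P[a] @ V_approx for a in range(2)])
policy = Q.argmax(axis=0)               # greedy policy w.r.t. the approximation
print("ALP weights:", np.round(w, 3))
print("greedy policy (0 = operate, 1 = replace):", policy)

Here all state-action constraints are enumerated explicitly, which is only possible because the example is tiny; for the large-scale applications covered by this literature, the constraint set is handled by constraint sampling [17] or by problem-specific reductions and reformulations (e.g., [45, 52]).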

References

  1. Adelman D (2003) Price-directed replenishment of subsets: methodology and its application to inventory routing. Manuf Serv Oper Manag 5(4):348–371

  2. Adelman D (2004) A price-directed approach to stochastic inventory/routing. Oper Res 52(4):499–514

  3. Adelman D (2007) Dynamic bid prices in revenue management. Oper Res 55(4):647–661

  4. Adelman D, Barz Ch (2014) A unifying approximate dynamic programming model for the economic lot scheduling problem. Math Oper Res 39(2):374–402

  5. Adelman D, Klabjan D (2012) Computing near-optimal policies in generalized joint replenishment. INFORMS J Comput 24(1):148–164

  6. Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper Res 56(3):712–727

  7. Adelman D, Barz Ch, Olivares-Nadal A (2021) Dynamic basis function generation for network revenue management. Working paper

  8. Barz Ch, Kolisch R (2014) Hierarchical multi-skill resource assignment in the telecommunications industry. Prod Oper Manag 23(3):489–503

  9. Barz Ch, Rajaram K (2015) Elective patient admission and scheduling under multiple resource constraints. Prod Oper Manag 24(12):1907–1930

  10. Bertsekas DP (2012) Dynamic programming and optimal control, vol II, 4th edn. Athena Scientific, Nashua, NH

  11. Bertsekas DP (2019) Reinforcement learning and optimal control. Athena Scientific optimization and computation series. Athena Scientific, Nashua, NH

  12. Bhat N, Farias VF, Moallemi CC (2012) Non-parametric approximate dynamic programming via the kernel method. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, vol 1. NIPS’12, Red Hook. Curran Associates Inc, pp 386–394

  13. Blado D, Hu W, Toriello A (2016) Semi-infinite relaxations for the dynamic knapsack problem with stochastic item sizes. SIAM J Optim 26(3):1625–1648

  14. Büyüktahtakin IE (2011) Dynamic programming via linear programming. Wiley, Hoboken, New Jersey, U.S.

  15. Chen F, Cheng Q, Dong J, Yu Z, Wang G, Xu W (2015) Efficient approximate linear programming for factored MDPs. Int J Approx Reason 63:101–121

  16. de Farias DP, Van Roy B (2003) The linear programming approach to approximate dynamic programming. Oper Res 51(6):850–865

  17. de Farias DP, Van Roy B (2004) On constraint sampling in the linear programming approach to approximate dynamic programming. Math Oper Res 29(3):462–478

  18. Derman C (1962) On sequential decisions and Markov chains. Manag Sci 9(1):16–24

  19. Desai VV, Farias VF, Moallemi CC (2012) Approximate dynamic programming via a smoothed linear program. Oper Res 60(3):655–674

  20. Diamant A (2021) Dynamic multistage scheduling for patient-centered care plans. Health Care Manag Sci 24:827–844

  21. Farias VF, Van Roy B (2004) Tetris: experiments with the LP approach to approximate DP. Technical report, Stanford University

  22. Farias VF, Van Roy B (2006) Tetris: a study of randomized constraint sampling. In: Probabilistic and randomized methods for design under uncertainty. Springer, London, pp 189–201

  23. Farias VF, Van Roy B (2007) An approximate dynamic programming approach to network revenue management. Preprint

  24. Guestrin C, Koller D, Parr R, Venkataraman S (2003) Efficient solution algorithms for factored MDPs. J Artif Intell Res 19(1):399–468

  25. Hauskrecht M, Kveton B (2003) Linear program approximations for factored continuous-state Markov decision processes. In: Proceedings of the 16th International Conference on Neural Information Processing Systems, NIPS’03, Cambridge, MA. MIT Press, pp 895–902

  26. Kallenberg LCM (1994) Survey of linear programming for standard and nonstandard Markovian control problems, Part I: theory. ZOR – Methods Models Oper Res 40:1–42

  27. Ke J, Zhang D, Zheng H (2019) An approximate dynamic programming approach to dynamic pricing for network revenue management. Prod Oper Manag 28(11):2719–2737

  28. Ke J, Zhang D, Zheng H (2021) Compact reformulations of approximate linear programs for finite-horizon Markov decision processes. Working paper

  29. Klabjan D, Adelman D (2007) An infinite-dimensional linear programming algorithm for deterministic semi-Markov decision processes on Borel spaces. Math Oper Res 32(3):528–550

  30. Kunnumkal S, Talluri K (2016) Technical note–a note on relaxations of the choice network revenue management dynamic program. Oper Res 64(1):158–166

  31. Lin Q, Nadarajah S, Soheili N (2020) Revisiting approximate linear programming: constraint-violation learning with applications to inventory control and energy storage. Manag Sci 66(4):1544–1562

  32. Manne AS (1960) Linear programming and sequential decisions. Manag Sci 6(3):259–267

  33. Marquinez JT, Sauré A, Cataldo A, Ferrer J-C (2021) Identifying proactive ICU patient admission, transfer and diversion policies in a public-private hospital network. Eur J Oper Res 295(1):306–320

  34. Meissner J, Strauss A (2012) Network revenue management with inventory-sensitive bid prices and customer choice. Eur J Oper Res 216(2):459–468

  35. Nadarajah S, Margot F, Secomandi N (2015) Relaxations of approximate linear programs for the real option management of commodity storage. Manag Sci 61(12):3054–3076

  36. Osaki S, Mine H (1968) Linear programming algorithms for semi-Markovian decision processes. J Math Anal Appl 22:356–381

  37. Pakiman P, Nadarajah S, Soheili N, Lin Q (2021) Self-guided approximate linear programs. arXiv:2001.02798v2

  38. Patrick J, Puterman ML, Queyranne M (2008) Dynamic multipriority patient scheduling for a diagnostic resource. Oper Res 56(6):1507–1525

  39. Petrik M, Taylor G, Parr R, Zilberstein S (2010) Feature selection using regularization in approximate linear programs for Markov decision processes. arXiv:1005.1860v2

  40. Powell WB (2011) Approximate dynamic programming: solving the curses of dimensionality. Wiley series in probability and statistics, 2nd edn. Wiley, Hoboken, New Jersey, U.S.

  41. Puterman ML (2014) Markov decision processes: discrete stochastic dynamic programming. Wiley series in probability and statistics. Wiley, Hoboken, New Jersey, U.S.

  42. Restrepo M (2008) Computational methods for static allocation and real-time redeployment of ambulances. PhD thesis, Cornell University

  43. Sauré A, Patrick J, Tyldesley S, Puterman ML (2012) Dynamic multi-appointment patient scheduling for radiation therapy. Eur J Oper Res 223(2):573–584

  44. Schweitzer PJ, Seidmann A (1985) Generalized polynomial approximations in Markovian decision processes. J Math Anal Appl 110:568–582

  45. Tong Ch, Topaloglu H (2014) On the approximate linear programming approach for network revenue management problems. INFORMS J Comput 26(1):121–134

  46. Topaloglu H, Kunnumkal S (2006) Approximate dynamic programming methods for an inventory allocation problem under uncertainty. Nav Res Logist (NRL) 53(8):822–841

  47. Toriello A (2014) Optimal toll design: a lower bound framework for the asymmetric traveling salesman problem. Math Prog 144(1–2):247–264

  48. Toriello A, Haskell WB, Poremba M (2014) A dynamic traveling salesman problem with stochastic arc costs. Oper Res 62(5):1107–1125

  49. Trick M, Zin S (1993) A linear programming approach to solving stochastic dynamic programs. GSIA working papers, Carnegie Mellon University, Tepper School of Business

  50. Trick MA, Zin SE (1997) Spline approximations to value functions: linear programming approach. Macroecon Dyn 1(1):255–277

  51. Veatch MH (2015) Approximate linear programming for networks: average cost bounds. Comput Oper Res 63:32–45

  52. Vossen ThWM, Zhang D (2015) Reductions of approximate linear programs for network revenue management. Oper Res 63(6):1352–1371

  53. Wolfe P, Dantzig GB (1962) Linear programming in a Markov chain. Oper Res 10(5):702–710

  54. Zhang D, Adelman D (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transp Sci 43(3):381–394

Author information

Correspondence to Christiane Barz.

Copyright information

© 2023 Springer Nature Switzerland AG

About this entry

Cite this entry

Barz, C., Glanzer, M. (2023). Approximate Dynamic Programming: Linear Programming-Based Approaches. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_818-1

  • DOI: https://doi.org/10.1007/978-3-030-54621-2_818-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-54621-2

  • Online ISBN: 978-3-030-54621-2
