
Approximate Dynamic Programming: Linear Programming-Based Approaches

Living reference work entry, Encyclopedia of Optimization

Introduction

The theory of Markov decision processes (MDPs) provides a mathematical framework for modeling sequential decision problems under uncertainty. Most real-life sequential decision problems can adequately be described by an MDP. The resulting formulation, however, often has a very large number of states, making exact solution methods computationally infeasible. To overcome this issue, a variety of approximation methods has been developed, collectively referred to as approximate dynamic programming (ADP). These methods can broadly be categorized as simulation-based or linear programming (LP)-based. For simulation-based approaches, we refer to the books of Powell [40] and Bertsekas [10, 11]. In this chapter, we summarize the main ideas of LP-based approximate dynamic programming. This literature is sometimes also referred to as approximate linear programming (ALP), a term coined in [16].
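
To fix ideas, consider a discounted infinite-horizon MDP with finite state space \(\mathcal X\), action sets \(\mathcal A(\mathbf x)\), rewards \(r(\mathbf x,a)\), transition probabilities \(p(\mathbf x'\mid \mathbf x,a)\), and discount factor \(\gamma\in(0,1)\). The following display is the standard formulation from the literature cited above (e.g., [16, 32, 44]), written in generic notation rather than the notation of the model section below. For strictly positive state-relevance weights \(c\), the optimal value function is the unique optimal solution of the exact linear program

\[
\begin{aligned}
\min_{V}\quad & \sum_{\mathbf x\in\mathcal X} c(\mathbf x)\,V(\mathbf x)\\
\text{s.t.}\quad & V(\mathbf x)\;\ge\; r(\mathbf x,a)+\gamma\sum_{\mathbf x'\in\mathcal X} p(\mathbf x'\mid \mathbf x,a)\,V(\mathbf x')
\qquad\text{for all } \mathbf x\in\mathcal X,\ a\in\mathcal A(\mathbf x).
\end{aligned}
\]

The approximate linear program replaces \(V\) by a linear architecture \(V_{\boldsymbol\theta}(\mathbf x)=\sum_{k=1}^{K}\theta_k\,\phi_k(\mathbf x)\) built from a small number of basis functions \(\phi_1,\dots,\phi_K\), so that only the \(K\) weights \(\theta_k\) remain as decision variables; the still very large number of constraints is then handled, for example, by constraint sampling [17] or by problem-specific reductions.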

Models

Consider a Markov process \((X_t)_{t\in \mathbb{N}_0}\) that can be in any state \(\mathbf x\in \mathcal X\) with …
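
The remainder of the model section is not included in this preview. As a stand-in, here is a minimal numerical sketch, not taken from the entry, of the exact LP and its ALP restriction on a toy discounted MDP; the data, the basis matrix Phi, and all variable names are hypothetical, and the LPs are solved with scipy.optimize.linprog.

```python
# Minimal sketch (not from this entry): exact LP and ALP for a toy
# discounted MDP, solved with scipy.optimize.linprog. All data are made up.
import numpy as np
from scipy.optimize import linprog

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)

# Hypothetical transition kernels P[a][x, x'] and one-stage rewards r[x, a].
P = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(n_actions)]
r = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Exact LP:  min c^T V  s.t.  V(x) >= r(x,a) + gamma * P(.|x,a)^T V  for all x, a.
c = np.full(n_states, 1.0 / n_states)          # state-relevance weights
A_ub, b_ub = [], []
for a in range(n_actions):
    for x in range(n_states):
        # Constraint rewritten as  -V(x) + gamma * P(.|x,a)^T V <= -r(x,a).
        A_ub.append(-np.eye(n_states)[x] + gamma * P[a][x])
        b_ub.append(-r[x, a])
A_ub, b_ub = np.array(A_ub), np.array(b_ub)

exact = linprog(c, A_ub=A_ub, b_ub=b_ub,
                bounds=[(None, None)] * n_states, method="highs")

# ALP: restrict V to Phi @ theta (here: constant + linear basis), shrinking
# the number of variables from |X| to K while reusing the same constraints.
Phi = np.column_stack([np.ones(n_states), np.arange(n_states)])
alp = linprog(c @ Phi, A_ub=A_ub @ Phi, b_ub=b_ub,
              bounds=[(None, None)] * Phi.shape[1], method="highs")

print("exact value function:", np.round(exact.x, 3))
print("ALP upper bound     :", np.round(Phi @ alp.x, 3))
```

Any feasible solution of either LP is a pointwise upper bound on the optimal value function, which is why the ALP output is reported as an upper bound. In realistic applications the number of constraints (one per state-action pair) is itself prohibitive, motivating constraint sampling [17], constraint-violation learning [31], and the problem-specific reductions surveyed in the references below.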


References

  1. Adelman D (2003) Price-directed replenishment of subsets: methodology and its application to inventory routing. Manuf Serv Oper Manag 5(4):348–371


  2. Adelman D (2004) A price-directed approach to stochastic inventory/routing. Oper Res 52(4):499–514


  3. Adelman D (2007) Dynamic bid prices in revenue management. Oper Res 55(4):647–661


  4. Adelman D, Barz Ch (2014) A unifying approximate dynamic programming model for the economic lot scheduling problem. Math Oper Res 39(2):374–402


  5. Adelman D, Klabjan D (2012) Computing near-optimal policies in generalized joint replenishment. INFORMS J Comput 24(1):148–164


  6. Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper Res 56(3):712–727


  7. Adelman D, Barz Ch, Olivares-Nadal A (2021) Dynamic basis function generation for network revenue management. Working paper


  8. Barz Ch, Kolisch R (2014) Hierarchical multi-skill resource assignment in the telecommunications industry. Prod Oper Manag 23(3):489–503


  9. Barz Ch, Rajaram K (2015) Elective patient admission and scheduling under multiple resource constraints. Prod Oper Manag 24(12):1907–1930


  10. Bertsekas DP (2012) Dynamic programming and optimal control, vol II, 4th edn. Athena Scientific, Nashua, NH

  11. Bertsekas DP (2019) Reinforcement learning and optimal control. Athena Scientific optimization and computation series. Athena Scientific, Nashua, NH

  12. Bhat N, Farias VF, Moallemi CC (2012) Non-parametric approximate dynamic programming via the kernel method. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, vol 1. NIPS’12, Red Hook. Curran Associates Inc, pp 386–394


  13. Blado D, Hu W, Toriello A (2016) Semi-infinite relaxations for the dynamic knapsack problem with stochastic item sizes. SIAM J Optim 26(3):1625–1648


  14. Büyüktahtakin IE (2011) Dynamic programming via linear programming. Wiley, Hoboken, New Jersey, U.S. ISBN 9780470400531


  15. Chen F, Cheng Q, Dong J, Yu Z, Wang G, Xu W (2015) Efficient approximate linear programming for factored MDPs. Int J Approx Reason 63:101–121


  16. de Farias DP, Van Roy B (2003) The linear programming approach to approximate dynamic programming. Oper Res 51(6):850–865


  17. de Farias DP, Van Roy B (2004) On constraint sampling in the linear programming approach to approximate dynamic programming. Math Oper Res 29(3):462–478


  18. Derman C (1962) On sequential decisions and Markov chains. Manag Sci 9(1):16–24

  19. Desai VV, Farias VF, Moallemi CC (2012) Approximate dynamic programming via a smoothed linear program. Oper Res 60(3):655–674


  20. Diamant A (2021) Dynamic multistage scheduling for patient-centered care plans. Health Care Manag Sci 24:827–844


  21. Farias VF, Van Roy B (2004) Tetris: experiments with the LP approach to approximate DP. Technical report, Stanford University


  22. Farias VF, Van Roy B (2006) Tetris: a study of randomized constraint sampling. In: Probabilistic and randomized methods for design under uncertainty. Springer, London, pp 189–201

  23. Farias VF, Van Roy B (2007) An approximate dynamic programming approach to network revenue management. Preprint


  24. Guestrin C, Koller D, Parr R, Venkataraman S (2003) Efficient solution algorithms for factored MDPs. J Artif Intell Res 19(1):399–468


  25. Hauskrecht M, Kveton B (2003) Linear program approximations for factored continuous-state Markov decision processes. In: Proceedings of the 16th International Conference on Neural Information Processing Systems, NIPS’03, Cambridge, MA. MIT Press, pp 895–902

  26. Kallenberg LCM (1994) Survey of linear programming for standard and nonstandard Markovian control problems, Part I: theory. ZOR – Methods Models Oper Res 40:1–42

  27. Ke J, Zhang D, Zheng H (2019) An approximate dynamic programming approach to dynamic pricing for network revenue management. Prod Oper Manag 28(11):2719–2737


  28. Ke J, Zhang D, Zheng H (2021) Compact reformulations of approximate linear programs for finite-horizon Markov decision processes. Working paper

  29. Klabjan D, Adelman D (2007) An infinite-dimensional linear programming algorithm for deterministic semi-Markov decision processes on Borel spaces. Math Oper Res 32(3):528–550

  30. Kunnumkal S, Talluri K (2016) Technical note–a note on relaxations of the choice network revenue management dynamic program. Oper Res 64(1):158–166


  31. Lin Q, Nadarajah S, Soheili N (2020) Revisiting approximate linear programming: constraint-violation learning with applications to inventory control and energy storage. Manag Sci 66(4):1544–1562


  32. Manne AS (1960) Linear programming and sequential decisions. Manag Sci 6(3):259–267

  33. Marquinez JT, Sauré A, Cataldo A, Ferrer J-C (2021) Identifying proactive ICU patient admission, transfer and diversion policies in a public-private hospital network. Eur J Oper Res 295(1):306–320


  34. Meissner J, Strauss A (2012) Network revenue management with inventory-sensitive bid prices and customer choice. Eur J Oper Res 216(2):459–468


  35. Nadarajah S, Margot F, Secomandi N (2015) Relaxations of approximate linear programs for the real option management of commodity storage. Manag Sci 61(12):3054–3076


  36. Osaki S, Mine H (1968) Linear programming algorithms for semi-Markovian decision processes. J Math Anal Appl 22:356–381

  37. Pakiman P, Nadarajah S, Soheili N, Lin Q (2021) Self-guided approximate linear programs. arXiv:2001.02798v2


  38. Patrick J, Puterman ML, Queyranne M (2008) Dynamic multipriority patient scheduling for a diagnostic resource. Oper Res 56(6):1507–1525


  39. Petrik M, Taylor G, Parr R, Zilberstein S (2010) Feature selection using regularization in approximate linear programs for Markov decision processes. arXiv:1005.1860v2


  40. Powell WB (2011) Approximate dynamic programming: solving the curses of dimensionality. Wiley series in probability and statistics, 2nd edn. Wiley, Hoboken, New Jersey, U.S.


  41. Puterman ML (2014) Markov decision processes: discrete stochastic dynamic programming. Wiley series in probability and statistics. Wiley, Hoboken, New Jersey, U.S.


  42. Restrepo M (2008) Computational methods for static allocation and real-time redeployment of ambulances. PhD thesis, Cornell University


  43. Sauré A, Patrick J, Tyldesley S, Puterman ML (2012) Dynamic multi-appointment patient scheduling for radiation therapy. Eur J Oper Res 223(2):573–584


  44. Schweitzer PJ, Seidmann A (1985) Generalized polynomial approximations in Markovian decision processes. J Math Anal Appl 110:568–582

  45. Tong Ch, Topaloglu H (2014) On the approximate linear programming approach for network revenue management problems. INFORMS J Comput 26(1):121–134


  46. Topaloglu H, Kunnumkal S (2006) Approximate dynamic programming methods for an inventory allocation problem under uncertainty. Nav Res Logist (NRL) 53(8):822–841


  47. Toriello A (2014) Optimal toll design: a lower bound framework for the asymmetric traveling salesman problem. Math Prog 144(1–2):247–264


  48. Toriello A, Haskell WB, Poremba M (2014) A dynamic traveling salesman problem with stochastic arc costs. Oper Res 62(5):1107–1125


  49. Trick M, Zin S (1993) A linear programming approach to solving stochastic dynamic programming. GSIA working papers, Carnegie Mellon University, Tepper School of Business


  50. Trick MA, Zin SE (1997) Spline approximations to value functions: linear programming approach. Macroecon Dyn 1(1):255–277


  51. Veatch MH (2015) Approximate linear programming for networks: average cost bounds. Comput Oper Res 63:32–45


  52. Vossen ThWM, Zhang D (2015) Reductions of approximate linear programs for network revenue management. Oper Res 63(6):1352–1371

  53. Wolfe P, Dantzig GB (1962) Linear programming in a Markov chain. Oper Res 10(5):702–710

  54. Zhang D, Adelman D (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transp Sci 43(3):381–394



Author information


Correspondence to Christiane Barz.


Copyright information

© 2023 Springer Nature Switzerland AG

About this entry


Cite this entry

Barz, C., Glanzer, M. (2023). Approximate Dynamic Programming: Linear Programming-Based Approaches. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_818-1


  • DOI: https://doi.org/10.1007/978-3-030-54621-2_818-1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-54621-2

  • Online ISBN: 978-3-030-54621-2
