On the optimality equation for average cost Markov decision processes and its validity for inventory control

  • Feinberg: Probability
Annals of Operations Research

Abstract

As is well known, average-cost optimality inequalities imply the existence of stationary optimal policies for Markov decision processes with average costs per unit time, and these inequalities hold under broad natural conditions. This paper provides sufficient conditions for the validity of the average-cost optimality equation for an infinite state problem with weakly continuous transition probabilities and with possibly unbounded one-step costs and noncompact action sets. These conditions also imply the convergence of sequences of discounted relative value functions to average-cost relative value functions and the continuity of average-cost relative value functions. As shown in this paper, the classic periodic-review setup-cost inventory control problem with backorders and convex holding/backlog costs satisfies these conditions. Therefore, the optimality inequality holds in the form of an equality with a continuous average-cost relative value function for this problem. In addition, the K-convexity of discounted relative value functions and their convergence to average-cost relative value functions, when the discount factor increases to 1, imply the K-convexity of average-cost relative value functions. This implies that average-cost optimal (s, S) policies for the inventory control problem can be derived from the average-cost optimality equation.
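
For orientation, the objects named in the abstract can be written out in the notation commonly used for average-cost Markov decision processes. This is only a sketch: the symbols below (state $x$, action set $A(x)$, one-step cost $c$, transition kernel $q$, constant $w$, relative value function $u$, discounted value function $v_\alpha$, setup cost $K$) are assumptions chosen for illustration and do not appear in the abstract, so the paper's own notation and exact hypotheses may differ.

```latex
% Average-cost optimality inequality (assumed standard form):
\[
  w + u(x) \;\ge\; \min_{a \in A(x)}
    \Bigl[\, c(x,a) + \int u(y)\, q(dy \mid x,a) \Bigr].
\]

% Average-cost optimality equation: the same relation holding with equality.
\[
  w + u(x) \;=\; \min_{a \in A(x)}
    \Bigl[\, c(x,a) + \int u(y)\, q(dy \mid x,a) \Bigr].
\]

% Discounted relative value function for discount factor \alpha \in (0,1),
% one common normalization (an assumption here):
\[
  u_\alpha(x) \;=\; v_\alpha(x) - \inf_{y} v_\alpha(y).
\]

% K-convexity of a function f on the real line, for K \ge 0:
% for all x \le y and \lambda \in [0,1],
\[
  f\bigl((1-\lambda)x + \lambda y\bigr)
    \;\le\; (1-\lambda)\, f(x) + \lambda\, f(y) + \lambda K .
\]

% An (s, S) policy with s \le S: order up to S when the inventory
% level x is below s, and do not order otherwise.
\[
  a(x) \;=\;
  \begin{cases}
    S - x, & x < s,\\
    0,     & x \ge s.
  \end{cases}
\]
```

Read this way, the abstract's last two sentences say that K-convexity is preserved when the functions $u_\alpha$ converge as $\alpha \uparrow 1$, so the limiting function $u$ in the optimality equation is itself K-convex, which is the property typically used to show that a minimizer of the right-hand side has the $(s, S)$ form.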



Acknowledgements

This research was partially supported by NSF Grants CMMI-1335296 and CMMI-1636193.

Author information

Corresponding author

Correspondence to Eugene A. Feinberg.

About this article

Cite this article

Feinberg, E.A., Liang, Y. On the optimality equation for average cost Markov decision processes and its validity for inventory control. Ann Oper Res 317, 569–586 (2022). https://doi.org/10.1007/s10479-017-2561-9

  • DOI: https://doi.org/10.1007/s10479-017-2561-9
