Abstract
As is well known, average-cost optimality inequalities imply the existence of stationary optimal policies for Markov decision processes with average costs per unit time, and these inequalities hold under broad natural conditions. This paper provides sufficient conditions for the validity of the average-cost optimality equation for an infinite-state problem with weakly continuous transition probabilities and with possibly unbounded one-step costs and noncompact action sets. These conditions also imply the convergence of sequences of discounted relative value functions to average-cost relative value functions and the continuity of average-cost relative value functions. As shown in this paper, the classic periodic-review setup-cost inventory control problem with backorders and convex holding/backlog costs satisfies these conditions. Therefore, for this problem the optimality inequality holds in the form of an equality with a continuous average-cost relative value function. In addition, the K-convexity of discounted relative value functions and their convergence to average-cost relative value functions, as the discount factor increases to 1, imply the K-convexity of average-cost relative value functions. This implies that average-cost optimal (s, S) policies for the inventory control problem can be derived from the average-cost optimality equation.
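In standard MDP notation (state x, available actions A(x), one-step cost c(x, a), transition kernel q; these symbols are illustrative and need not match the paper's own notation), the objects discussed in the abstract can be sketched as follows:

```latex
% Average-cost optimality equation (ACOE): w is the optimal long-run
% average cost per unit time and u is an average-cost relative value function.
w + u(x) \;=\; \min_{a \in A(x)} \Big\{ c(x,a) + \int_{\mathbb{X}} u(y)\, q(dy \mid x,a) \Big\}.

% Vanishing-discount relation: with v_\alpha the \alpha-discounted value
% function and \bar{x} a fixed reference state, the discounted relative
% value functions are
u_\alpha(x) \;=\; v_\alpha(x) - v_\alpha(\bar{x}),

% and the paper's conditions yield their convergence to u as \alpha \uparrow 1.
% K-convexity: f is K-convex (K \ge 0) if for all x \le y and \lambda \in [0,1],
f\big(\lambda x + (1-\lambda) y\big) \;\le\;
  \lambda f(x) + (1-\lambda)\big(f(y) + K\big).

% K-convexity is preserved under pointwise limits, so K-convexity of each
% v_\alpha carries over to u; minimizing a K-convex function with a setup
% cost K then produces an (s,S) policy: order up to level S whenever the
% inventory position x falls below s, and do not order otherwise.
```

The last step is the classical Scarf-style argument: once the average-cost relative value function is known to be K-convex, the minimizer in the ACOE has the two-threshold (s, S) form.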
Acknowledgements
This research was partially supported by NSF Grants CMMI-1335296 and CMMI-1636193.
Feinberg, E.A., Liang, Y. On the optimality equation for average cost Markov decision processes and its validity for inventory control. Ann Oper Res 317, 569–586 (2022). https://doi.org/10.1007/s10479-017-2561-9