
Suboptimal policy determination for large-scale Markov decision processes, Part 1: Description and bounds

  • Contributed Papers
  • Published in: Journal of Optimization Theory and Applications

Abstract

This paper is the first of two papers that present and evaluate an approach for determining suboptimal policies for large-scale Markov decision processes (MDP). Part 1 is devoted to the determination of bounds that motivate the development and indicate the quality of the suboptimal design approach; Part 2 is concerned with the implementation and evaluation of the suboptimal design approach. The specific MDP considered is the infinite-horizon, expected total discounted cost MDP with finite state and action spaces. The approach can be described as follows. First, the original MDP is approximated by a specially structured MDP. The special structure suggests how to construct associated smaller, more computationally tractable MDP's. The suboptimal policy for the original MDP is then constructed from the solutions of these smaller MDP's. The key feature of this approach is that the state and action space cardinalities of the smaller MDP's are exponential reductions of the state and action space cardinalities of the original MDP.
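
The decomposition idea summarized in the abstract can be pictured with a toy example. The sketch below is not the authors' construction: the special structure, the bounds, and the policy-assembly rule are developed in the paper itself. It merely assumes a hypothetical two-module MDP whose state and action spaces factor as products and whose transition probabilities and one-stage costs decouple by module, so each module can be solved as a small discounted MDP by value iteration and the per-module policies combined into a joint, generally suboptimal, policy. All names, sizes, and numbers are illustrative.

```python
import numpy as np

# Hypothetical illustration: a 2-module MDP in which the state space S = S1 x S2
# and the action space A = A1 x A2 factor by module, and (as an assumed special
# structure) transitions and costs decouple by module. Under that assumption each
# module is a small discounted MDP on its own, and the per-module optimal
# policies are combined into a joint policy for the original problem.

GAMMA = 0.9  # discount factor


def solve_discounted_mdp(P, c, gamma=GAMMA, tol=1e-8):
    """Value iteration for a finite, infinite-horizon, discounted-cost MDP.

    P : array (nA, nS, nS), P[a, s, s'] = transition probability
    c : array (nA, nS),     c[a, s]     = one-stage cost
    Returns the optimal value function and a greedy (optimal) policy.
    """
    nA, nS, _ = P.shape
    v = np.zeros(nS)
    while True:
        # Q[a, s] = c(s, a) + gamma * sum_{s'} P(s' | s, a) v(s')
        q = c + gamma * (P @ v)
        v_new = q.min(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v, q.argmin(axis=0)


rng = np.random.default_rng(0)


def random_module(nS, nA):
    """Made-up module dynamics and costs, purely for illustration."""
    P = rng.random((nA, nS, nS))
    P /= P.sum(axis=2, keepdims=True)   # make each row a probability distribution
    c = rng.random((nA, nS))            # one-stage costs in [0, 1)
    return P, c


# Two small module MDPs: 4 and 5 states instead of the joint 4 * 5 = 20 states.
P1, c1 = random_module(nS=4, nA=2)
P2, c2 = random_module(nS=5, nA=3)

v1, mu1 = solve_discounted_mdp(P1, c1)
v2, mu2 = solve_discounted_mdp(P2, c2)


def joint_policy(s1, s2):
    """Suboptimal joint policy for the original MDP, assembled from the
    per-module policies; it is optimal only if the modules truly decouple."""
    return mu1[s1], mu2[s2]


print(joint_policy(2, 3))
```

The cardinality effect is the point of the example: the joint problem has 4 x 5 = 20 states and 2 x 3 = 6 actions, while each module problem has at most 5 states and 3 actions. With m modules the joint cardinalities grow as products (exponentially in m) while the per-module problems do not, which is the exponential reduction the abstract refers to.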

Additional information

Communicated by R. A. Howard

This research was supported by NSF Grant Nos. ECS-80-18266 and ECS-83-19355.

Cite this article

White, C.C., Popyack, J.L. Suboptimal policy determination for large-scale Markov decision processes, Part 1: Description and bounds. J Optim Theory Appl 46, 319–341 (1985). https://doi.org/10.1007/BF00939287
