
On the construction of ε-optimal strategies in partially observed MDPs

Annals of Operations Research 28, 81–95 (1991)

Abstract

The purpose of this paper is to give a survey of methods, partly derived by the author in joint work with other researchers, for constructing ε-optimal strategies for partially observable MDPs. The methods consist essentially in transforming the problem into one of approximation: starting from the original problem, a sequence of approximating problems is constructed such that:

  (i) For each approximating problem, an optimal strategy can actually be computed.

  (ii) Given ε > 0, there exists an approximating problem such that the optimal strategy for the latter is ε-optimal for the original problem.
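To make the two-step scheme above concrete, the sketch below carries it out for the simplest instance: a finite-state POMDP is reduced to its equivalent belief-state MDP, the belief simplex is replaced by a finite grid (the approximating problem), the grid MDP is solved exactly by value iteration, and the resulting greedy strategy is then used on the original problem, with grid refinement playing the role of shrinking ε. This is a minimal illustration of the general idea, not the paper's construction: the toy model (matrices P, Q, R), the nearest-neighbour projection, and all function names are assumptions of the sketch.

```python
# Minimal sketch of the abstract's scheme: discretize the belief-state MDP
# of a finite POMDP and solve the resulting finite ("approximating") MDP.
# The toy two-state, two-action, two-observation model is illustrative only.
import numpy as np

P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.4, 0.6]]])          # P[a][s, s']: transition probabilities
Q = np.array([[0.8, 0.2],
              [0.3, 0.7]])            # Q[s', o]: observation probabilities
R = np.array([[1.0, 0.0],
              [0.0, 1.5]])            # R[s, a]: one-step reward
GAMMA = 0.95                          # discount factor

def belief_update(b, a, o):
    """Bayes filter: posterior over states after taking a and observing o."""
    unnorm = Q[:, o] * (b @ P[a])
    return unnorm / unnorm.sum()

def belief_grid(n):
    """Uniform grid on the 1-simplex with mesh 1/n (two states only)."""
    return np.array([[k / n, 1.0 - k / n] for k in range(n + 1)])

def project(b, grid):
    """Index of the grid point closest to belief b (nearest-neighbour)."""
    return int(np.argmin(np.abs(grid[:, 0] - b[0])))

def q_value(b, a, V, grid):
    """One-step lookahead value of action a at belief b under grid values V."""
    val = b @ R[:, a]
    pred = b @ P[a]                   # predicted next-state distribution
    for o in range(Q.shape[1]):
        p_o = pred @ Q[:, o]          # probability of observing o
        if p_o > 1e-12:
            val += GAMMA * p_o * V[project(belief_update(b, a, o), grid)]
    return val

def solve_grid_mdp(grid, iters=500):
    """Value iteration on the finite approximating MDP over grid beliefs."""
    V = np.zeros(len(grid))
    for _ in range(iters):
        V = np.array([max(q_value(b, a, V, grid) for a in range(P.shape[0]))
                      for b in grid])
    # Greedy strategy of the approximating problem, used on the original POMDP.
    def policy(b):
        return max(range(P.shape[0]), key=lambda a: q_value(b, a, V, grid))
    return V, policy

grid = belief_grid(50)                # finer grid <-> smaller epsilon
V, policy = solve_grid_mdp(grid)
print("action chosen at the uniform belief:", policy(np.array([0.5, 0.5])))
```

In this sketch, condition (i) holds because the grid MDP is finite and value iteration computes its optimal strategy, while refining the grid plays the role of choosing the approximating problem demanded by condition (ii).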




Cite this article

Runggaldier, W.J. On the construction of ε-optimal strategies in partially observed MDPs. Ann Oper Res 28, 81–95 (1991). https://doi.org/10.1007/BF02055576
