Abstract
We consider several applications of two-state, finite-action, infinite-horizon, discrete-time Markov decision processes with partial observations, for two special cases of observation quality, and show that in each of these cases the optimal cost function is piecewise linear. This in turn allows us to obtain either explicit formulas or simplified algorithms to compute the optimal cost function and the associated optimal control policy. Several examples are presented.
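The piecewise linearity asserted in the abstract can be made concrete with a small sketch: for a two-state partially observed model, the belief state is a single number p = Prob(state = 1), and exact value iteration keeps the cost-to-go as a minimum of linear functions of p ("alpha-vectors"), so it stays piecewise linear at every step. The model data below (transition, observation, and cost numbers) are purely illustrative assumptions, not taken from the paper; the backup itself is the standard Sondik-style dynamic-programming step.

```python
import itertools

# Hypothetical two-state replacement-type model (all numbers illustrative).
# States: 0 = "good", 1 = "bad". Actions: 0 = continue, 1 = repair.
P = {0: [[0.9, 0.1], [0.0, 1.0]],   # transition matrices P[a][s][s']
     1: [[1.0, 0.0], [1.0, 0.0]]}  # repair resets the state to "good"
Q = {0: [[0.8, 0.2], [0.3, 0.7]],   # observation matrices Q[a][s'][o]
     1: [[0.8, 0.2], [0.3, 0.7]]}
c = {0: [0.0, 2.0], 1: [1.5, 1.5]}  # immediate cost c[a][s]
beta = 0.9                          # discount factor

def prune(vectors):
    """Keep only alpha-vectors on the lower envelope over p in [0, 1].
    v is kept iff some p satisfies (1-p)v0 + p*v1 <= (1-p)w0 + p*w1 for all w."""
    vectors = sorted(set(vectors))
    keep = []
    for i, v in enumerate(vectors):
        lo, hi, ok = 0.0, 1.0, True
        for j, w in enumerate(vectors):
            if j == i:
                continue
            d0, d1 = v[0] - w[0], v[1] - w[1]
            slope = d1 - d0          # need d0 + p*slope <= 0
            if abs(slope) < 1e-12:
                ok = d0 <= 1e-9
            elif slope > 0:
                hi = min(hi, -d0 / slope)
            else:
                lo = max(lo, -d0 / slope)
            if not ok or lo > hi + 1e-9:
                ok = False
                break
        if ok:
            keep.append(v)
    return keep

def backup(alphas):
    """One exact DP step: a minimum of linear functions maps to another
    minimum of linear functions, so the cost stays piecewise linear."""
    new = []
    for a in (0, 1):
        # pick one current alpha-vector per observation o in {0, 1}
        for choice in itertools.product(alphas, repeat=2):
            vec = []
            for s in (0, 1):
                g = c[a][s]
                for sp in (0, 1):
                    for o in (0, 1):
                        g += beta * P[a][s][sp] * Q[a][sp][o] * choice[o][sp]
                vec.append(g)
            new.append(tuple(vec))
    return prune(new)

alphas = [(0.0, 0.0)]
for _ in range(12):
    alphas = backup(alphas)

def V(p):
    """Approximate optimal cost at belief p = Prob(state = 1)."""
    return min((1 - p) * a0 + p * a1 for a0, a1 in alphas)
```

The pruning step is what keeps the representation finite: only vectors that attain the minimum somewhere on the belief interval are retained, which is exactly the sense in which the special cases studied in the paper admit explicit or simplified computation.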
Additional information
Research supported in part by the Air Force Office of Scientific Research under Grant AFOSR-86-0029, in part by the National Science Foundation under Grant ECS-8617860, in part by the Advanced Technology Program of the State of Texas, and in part by the DoD Joint Services Electronics Program through the Air Force Office of Scientific Research (AFSC) Contract F49620-86-C-0045.
Cite this article
Sernik, E.L., Marcus, S.I. On the computation of the optimal cost function for discrete time Markov models with partial observations. Ann Oper Res 29, 471–511 (1991). https://doi.org/10.1007/BF02283611