Summary
The question of the existence of good Markov [good stationary] policies is studied for a general class of Borel [stationary] dynamic programming models. It is shown, for example, that Markov [stationary] policies are uniformly adequate if every transition law is absolutely continuous with respect to a fixed measure [and the reward function is positive or the model satisfies certain compactness and continuity conditions].
References
Barbosa-Dantas, C.A.: The existence of stationary optimal plans. Dissertation. Univ. of California, Berkeley (1966)
Bertsekas, D.P., Shreve, S.E.: Stochastic optimal control. New York: Academic Press 1978
Blackwell, D.: Positive dynamic programming. Proc. Fifth Berkeley Sympos. Math. Stat. Probab. 1, 415–418. University of California (1965)
Blackwell, D.: The stochastic processes of Borel gambling and dynamic programming. Ann. Stat. 4, 370–374 (1976)
Blackwell, D., Freedman, D., Orkin, M.: The optimal reward operator in dynamic programming. Ann. Probab. 2, 926–941 (1974)
Bodewig, H.-H.: Über dynamische Optimierung mit endlich additiven Maßen. Master thesis, Univ. of Bonn (1979)
Brown, L.D., Purves, R.: Measurable selections of extrema. Ann. Stat. 1, 902–912 (1973)
Dawen, R. van: Stationäre Politiken in stochastischen Entscheidungsproblemen. Dissertation, Univ. of Bonn (1983). One part will appear in Math. Op. Res. under the title: Pointwise and uniformly good stationary strategies for dynamic programming models
Dawen, R. van: On stationary strategies in positive stochastic 1 and 2 person games with general state space. ZAMM 64, T327–328 (1984)
Dawen, R. van, Schäl, M.: On the existence of stationary optimal policies in Markov decision models. ZAMM 63, T403–404 (1983a)
Dubins, L.E., Savage, L.J.: How to gamble if you must. New York: McGraw-Hill 1965
Dubins, L.E., Sudderth, W.: Persistently ε-optimal strategies. Math. Op. Res. 2, 125–134 (1977a)
Dubins, L.E., Sudderth, W.: Countably additive gambling and optimal stopping. Z. Wahrscheinlichkeitstheor. Verw. Geb. 41, 59–72 (1977b)
Dubins, L.E., Sudderth, W.: On stationary strategies for absolutely continuous houses. Ann. Probab. 7, 461–467 (1979)
Engelbert, A., Engelbert, H.J.: Optimal stopping and almost sure convergence of random sequences. Z. Wahrscheinlichkeitstheor. Verw. Geb. 48, 309–325 (1979)
Fainberg, E.A., Sonin, I.M.: Stationary and Markov policies in countable state dynamic programming. Lect. Notes Math. 1021, 111–129. Berlin Heidelberg New York: Springer 1983
Fainberg, E.A., Sonin, I.M.: Persistently nearly optimal strategies in stochastic dynamic programming. Statistics and control of stochastic processes, Proc. Steklov Semin., Moscow 1984. Transl. Ser. Math. Engl. (1985)
Federgruen, A., Hordijk, A., Tijms, H.C.: Denumerable state semi-Markov decision processes with unbounded costs. Average cost criterion. Stochastic Processes Appl. 9, 223–235 (1979)
Frid, E.B.: On a problem of D. Blackwell from the theory of dynamic programming. Theor. Probab. Appl. 15, 719–722 (1970)
Hinderer, K.: Foundations of non-stationary dynamic programming with discrete time-parameter. Lect. Notes Operat. Res. Math. Systems 33. Berlin Heidelberg New York: Springer 1970
Kertz, R.P.: Renewal plans and persistent optimality in countably additive gambling. Math. Op. Res. 7, 361–382 (1982)
Ornstein, D.: On the existence of stationary optimal strategies. Proc. Am. Math. Soc. 20, 563–569 (1969)
Schäl, M.: Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal. Z. Wahrscheinlichkeitstheor. Verw. Geb. 32, 179–196 (1975)
Schäl, M.: Stationary policies in dynamic programming models under compactness assumptions. Math. Op. Res. 8, 366–372 (1983)
Strauch, R.E.: Negative dynamic programming. Ann. Math. Stat. 37, 871–890 (1966)
Sudderth, W.D.: On the existence of good stationary strategies. Trans. Am. Math. Soc. 135, 399–414 (1969)
Sudderth, W.D.: A ‘Fatou equation’ for randomly stopped variables. Ann. Math. Stat. 42, 2143–2146 (1971)
Wal, J. van der: On stationary strategies in countable state total reward Markov decision processes. Math. Op. Res. 9, 290–300 (1984)
Wal, J. van der: On uniformly nearly-optimal Markov strategies. Operations Research Proceedings 1982, pp. 461–467 (1983)
Additional information
Research supported by “Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 72”
Research supported by National Science Foundation Grant MCS 8100789
Cite this article
Schäl, M., Sudderth, W. Stationary policies and Markov policies in Borel dynamic programming. Probab. Th. Rel. Fields 74, 91–111 (1987). https://doi.org/10.1007/BF01845641
Keywords
- Stochastic Process
- Probability Theory
- Dynamic Programming
- Programming Model
- Stationary Policy