Nonstationary Markov decision problems with converging parameters

Federgruen, A.; Schweitzer, P. J.

doi:10.1007/BF00935474


Contributed Papers
Published: June 1981

Volume 34, pages 207–241, (1981)

Journal of Optimization Theory and Applications

A. Federgruen¹ &
P. J. Schweitzer²


Abstract

This paper considers the solution of Markov decision problems whose parameters can be obtained only via approximating schemes, or where it is computationally preferable to approximate the parameters, rather than employing exact algorithms for their computation.

Various models are presented in which this situation occurs. Furthermore, it is shown that a modified value-iteration method may be employed, both for the discounted version and for the undiscounted version of the model, in order to solve the optimality equation and to find optimal policies. In both cases, the convergence rate is determined.
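The discounted case can be illustrated with a minimal sketch. It is not the authors' algorithm: the two-state model, the geometric perturbation scheme in `approx_params`, the discount factor, and every function name are assumptions made for illustration. The sketch runs value iteration in which step n uses only the n-th approximation of the rewards and transition probabilities, then compares the result with ordinary value iteration on the limiting parameters.

```python
# Illustrative sketch only: the model, the perturbation scheme, and all names
# below are assumptions, not taken from the paper.

BETA = 0.9  # discount factor (assumed)

# Limiting model: 2 states, 2 actions.
# R[s][a] is the one-step reward, P[s][a][j] the transition probability to j.
R = [[1.0, 0.0], [0.0, 2.0]]
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]


def approx_params(n, rho=0.5):
    """Return the n-th approximation (r_n, p_n); the error shrinks like rho**n."""
    eps = rho ** n
    r_n = [[R[s][a] + eps for a in range(2)] for s in range(2)]
    # Mixing with the uniform distribution keeps every row stochastic.
    p_n = [[[(1 - eps) * P[s][a][j] + eps * 0.5 for j in range(2)]
            for a in range(2)] for s in range(2)]
    return r_n, p_n


def bellman_step(v, r, p, beta):
    """One value-iteration step: maximize reward plus discounted expected value."""
    states = range(len(v))
    return [max(r[s][a] + beta * sum(p[s][a][j] * v[j] for j in states)
                for a in range(len(r[s])))
            for s in states]


def modified_value_iteration(n_iter, beta=BETA):
    """Value iteration that uses the n-th parameter approximation at step n."""
    v = [0.0, 0.0]
    for n in range(n_iter):
        r_n, p_n = approx_params(n)
        v = bellman_step(v, r_n, p_n, beta)
    return v


def exact_value_iteration(n_iter, beta=BETA):
    """Ordinary value iteration on the limiting parameters, for comparison."""
    v = [0.0, 0.0]
    for _ in range(n_iter):
        v = bellman_step(v, R, P, beta)
    return v


v_mod = modified_value_iteration(300)
v_ref = exact_value_iteration(300)
gap = max(abs(a - b) for a, b in zip(v_mod, v_ref))
print(v_mod, v_ref, gap)
```

In this toy setting the parameter error decays faster than the discount factor contracts, so the gap between the two iterates is governed by the discount factor. The undiscounted case needs a different argument and is not covered by this sketch.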

As a side result, we characterize the asymptotic behavior of backward products of a geometrically convergent sequence of Markov matrices.
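A small numerical experiment can make the notion of a backward product concrete. The abstract does not state the precise result, so the following is only a sketch under assumptions: the limit matrix `P_LIMIT`, the perturbing matrix `Q`, the rate `rho`, and all function names are hypothetical. A backward product multiplies each new matrix on the left, U_n = P_n P_{n-1} ... P_1, and the code measures how far the rows of U_n are from being identical.

```python
# Illustrative sketch only: the matrices, the rate rho, and all names are
# assumptions chosen for the example, not taken from the paper.

P_LIMIT = [
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.1, 0.1, 0.8],
]
# A permutation matrix used to perturb the limit; any stochastic matrix works.
Q = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def perturbed(n, rho=0.5):
    """Markov matrix P_n whose distance to P_LIMIT shrinks like rho**n."""
    eps = rho ** n
    return [[(1 - eps) * P_LIMIT[i][j] + eps * Q[i][j] for j in range(3)]
            for i in range(3)]


def backward_product(n):
    """U_n = P_n P_{n-1} ... P_1: each new matrix multiplies on the left."""
    prod = perturbed(1)
    for k in range(2, n + 1):
        prod = matmul(perturbed(k), prod)
    return prod


def row_spread(m):
    """Largest difference between two entries of the same column."""
    return max(max(col) - min(col) for col in zip(*m))


for n in (1, 5, 20, 200):
    print(n, row_spread(backward_product(n)))
```

In this example the row spread shrinks toward zero as n grows, so the product approaches a matrix with identical rows. The common row depends on the whole sequence P_1, P_2, ... and need not equal the stationary distribution of the limit matrix.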



Author information

Authors and Affiliations

Graduate School of Business, Columbia University, New York, New York
A. Federgruen (Assistant Professor)
Graduate School of Management, University of Rochester, Rochester, New York
P. J. Schweitzer (Professor)


Additional information

Communicated by R. A. Howard


About this article

Cite this article

Federgruen, A., Schweitzer, P.J. Nonstationary Markov decision problems with converging parameters. J Optim Theory Appl 34, 207–241 (1981). https://doi.org/10.1007/BF00935474


Issue Date: June 1981
DOI: https://doi.org/10.1007/BF00935474
