Asymptotic expansions for dynamic programming recursions with general nonnegative matrices

  • W. H. M. Zijm
Contributed Papers

Abstract

This paper is concerned with the study of the asymptotic behavior of dynamic programming recursions of the form
$$x(n+1) = \max_{P \in \mathcal{K}} P x(n), \qquad n = 0, 1, 2, \ldots,$$
where $\mathcal{K}$ denotes a set of matrices, generated by all possible interchanges of corresponding rows, taken from a fixed finite set of nonnegative square matrices. These recursions arise in a number of well-known and frequently studied problems, e.g. in the theory of controlled Markov chains, Leontief substitution systems, and controlled branching processes. Results concerning the asymptotic behavior of x(n), for n → ∞, are established in terms of the maximal spectral radius, the maximal index, and a set of generalized eigenvectors. A key role in the analysis is played by a geometric convergence result for value iteration in undiscounted multichain Markov decision processes. A new proof of this result is also presented.
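
As an illustration only, the recursion can be sketched in a few lines of Python. Because the matrix set is closed under interchanges of corresponding rows, the componentwise maximum can be taken one row at a time. The function names (`dp_step`, `iterate`) and the two 2×2 matrices `A` and `B` are hypothetical choices made for this sketch and do not come from the paper.

```python
def dp_step(row_choices, x):
    """One step x -> max over P of P x, computed row by row.

    row_choices[i] is the finite list of candidate i-th rows. The matrix set
    consists of every matrix built by picking one candidate per row, so the
    componentwise maximum is attained by maximizing each row independently.
    """
    return [
        max(sum(p * v for p, v in zip(row, x)) for row in rows)
        for rows in row_choices
    ]


def iterate(row_choices, x0, n):
    """Apply dp_step n times, starting from x0."""
    x = list(x0)
    for _ in range(n):
        x = dp_step(row_choices, x)
    return x


# Hypothetical data: two nonnegative 2x2 matrices (not taken from the paper).
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.5, 0.0], [1.0, 0.0]]

# Candidate rows for each row index: row i of A or row i of B.
row_choices = [[A[0], B[0]], [A[1], B[1]]]

x3 = iterate(row_choices, [1.0, 1.0], 3)
print(x3)  # [5.0, 3.0]

# Ratio of successive iterates as a rough estimate of the growth rate.
x30 = iterate(row_choices, [1.0, 1.0], 30)
x31 = dp_step(row_choices, x30)
print(x31[0] / x30[0])
```

For this particular toy data the iterates follow a Fibonacci-like pattern, and the printed ratio approaches the largest spectral radius among the four matrices that can be assembled from the rows of `A` and `B`. This is consistent with the role the maximal spectral radius plays in the abstract, but it is a numerical illustration, not a statement of the paper's theorems.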

Key Words

Dynamic programming, nonnegative matrices, asymptotic expansions, generalized eigenvectors, geometric convergence

Copyright information

© Plenum Publishing Corporation 1987

Authors and Affiliations

  • W. H. M. Zijm, Centre for Quantitative Methods, Nederlandse Philips Bedrijven, Eindhoven, Holland