Abstract
The pair of functional equations for undiscounted Markov renewal programs (MRPs) is solved by an iterative procedure which generates geometrically converging estimates for the gain-rate vector g* and the relative-value vector. In addition, monotonically converging lower bounds on g* are exhibited. The approach employs a hierarchical decomposition of the MRP into a set of communicating MRPs, with Hastings' bounds used to estimate the scalar gain rate for each member of the set.
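The flavor of the gain-rate bounds can be illustrated in the simplest special case: a unichain Markov decision process, i.e. an MRP with unit sojourn times, where the classical bounds min_i (v_{n+1}(i) - v_n(i)) ≤ g* ≤ max_i (v_{n+1}(i) - v_n(i)) bracket the scalar gain rate at every value-iteration sweep (Hastings extended such bounds to general sojourn times). The sketch below is not the paper's algorithm, and all numerical data are invented for illustration.

```python
# Illustrative sketch only: value iteration on a tiny two-state, two-action
# unichain MDP (an MRP with unit sojourn times), with the classical
# min/max bounds on the scalar gain rate g*.  All data are made up.

# rewards[a][i]: one-step reward in state i under action a
# P[a][i][j]:    transition probability i -> j under action a
rewards = [[1.0, 0.0], [2.0, -0.5]]
P = [[[0.5, 0.5], [0.3, 0.7]],
     [[0.9, 0.1], [0.2, 0.8]]]

def value_iteration_step(v):
    """One sweep: v'(i) = max_a [ r(i,a) + sum_j P(a,i,j) v(j) ]."""
    return [max(rewards[a][i] + sum(P[a][i][j] * v[j] for j in range(len(v)))
                for a in range(len(rewards)))
            for i in range(len(v))]

v = [0.0, 0.0]
for n in range(200):
    w = value_iteration_step(v)
    diffs = [w[i] - v[i] for i in range(len(v))]
    lo, hi = min(diffs), max(diffs)   # lo <= g* <= hi at every sweep
    v = w
    if hi - lo < 1e-8:                # bounds have closed on g*
        break

g_star = 0.5 * (lo + hi)              # estimate of the scalar gain rate
```

For this example the lower bound increases and the upper bound decreases toward the common limit g*, which is the essential monotone-bracketing behavior the paper exploits chain by chain in the multichain decomposition.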
Zusammenfassung
The functional equations for undiscounted Markov renewal programs (MRPs) are solved by an iterative procedure that yields geometrically converging estimates for the average gain g* and the relative values. In addition, monotonically converging lower bounds on g* are derived. The method uses a decomposition of the MRP into a set of communicating MRPs and applies bounds due to Hastings.
Schweitzer, P.J. A value-iteration scheme for undiscounted multichain Markov renewal programs. Zeitschrift für Operations Research 28, 143–152 (1984). https://doi.org/10.1007/BF01920916