Abstract
An iterative decomposition method is presented for computing the values in an infinite-horizon discounted Markov renewal program (DMRP). The states are partitioned into M groups; each iteration disaggregates one group at a time, with the other M − 1 groups collapsed into M − 1 singletons via the replacement process method. Each disaggregation is itself a DMRP and can be solved by policy iteration, value iteration, or linear programming. Anticipated benefits of the method include reduced computer time and memory requirements, scale invariance, and greater robustness to the starting point.
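The replacement-process collapse itself is beyond the scope of an abstract, but the abstract notes that each disaggregated subproblem "looks like a DMRP" and can be solved by value iteration. Below is a minimal sketch of such a subproblem solver, assuming the simplest discounted-MDP form with a single scalar discount factor β (in a true DMRP the effective discount would be state- and action-dependent, obtained from the Laplace transform of the holding-time distributions). The 2-state, 2-action data are entirely hypothetical.

```python
import numpy as np

def value_iteration(P, r, beta, tol=1e-8, max_iter=10_000):
    """Solve v(i) = max_a [ r(i,a) + beta * sum_j P(i,a,j) v(j) ].

    P : (n, A, n) transition probabilities
    r : (n, A)    one-step expected rewards
    beta in (0, 1): discount factor (scalar here for simplicity)
    Returns the value vector and a greedy policy.
    """
    n, A, _ = P.shape
    v = np.zeros(n)
    for _ in range(max_iter):
        q = r + beta * (P @ v)      # Q-values, shape (n, A)
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return v, q.argmax(axis=1)

# Hypothetical disaggregated subproblem: 2 states, 2 actions.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
v, policy = value_iteration(P, r, beta=0.9)
```

In the decomposition of the abstract, a solver of this kind would be invoked once per group per outer iteration, on the subproblem whose "other" states have been collapsed to singletons.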
Cite this article
Schweitzer, P.J., Sumita, U. & Ohno, K. Replacement process decomposition for discounted Markov renewal programming. Ann Oper Res 29, 631–645 (1991). https://doi.org/10.1007/BF02283617