Abstract
In this paper we consider finite-stage stochastic optimization problems with a utility criterion, that is, the stochastic evaluation of an associatively accumulated reward through a utility function. We optimize the expected value of the utility criterion not in the class of Markov policies but in the class of general policies. We show that, by expanding the state space, an invariant imbedding approach yields a recursive relation between each pair of adjacent optimal value functions. We further show that the utility problem with a general policy is equivalent to a terminal problem with a Markov policy on the augmented state space. Finally, it is shown that the utility problem admits an optimal policy in the class of general policies on the original state space.
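The reduction described above can be illustrated with a small numerical sketch. The toy model below (states, actions, transition kernel, per-stage rewards, and the square-root utility are all illustrative assumptions, not taken from the paper) augments the state with the reward accumulated so far, so that backward recursion over the augmented state (s, y) solves the expected-utility problem as a terminal problem:

```python
import math
from functools import lru_cache

# Hypothetical toy instance (all data illustrative).
states = [0, 1]
actions = [0, 1]
T = 3  # finite horizon

def prob(s, a, s2):
    # Toy transition kernel p(s2 | s, a).
    return 0.7 if s2 == (s ^ a) else 0.3

def reward(s, a):
    # Per-stage reward; here the accumulation rule is ordinary addition,
    # a special case of an associative reward composition.
    return s + a

def utility(total):
    # Nonlinear utility applied to the terminally accumulated reward.
    return math.sqrt(total)

# Backward recursion on the augmented state (s, y), where y is the
# reward accumulated so far:
#   V_T(s, y) = utility(y)
#   V_t(s, y) = max_a sum_{s'} p(s' | s, a) * V_{t+1}(s', y + r(s, a))
@lru_cache(maxsize=None)
def V(t, s, y):
    if t == T:
        return utility(y)
    return max(
        sum(prob(s, a, s2) * V(t + 1, s2, y + reward(s, a)) for s2 in states)
        for a in actions
    )

# Optimal expected utility from state 0 with zero accumulated reward.
print(V(0, 0, 0))
```

Because the augmented state carries the accumulated reward, the maximizing action at each (t, s, y) depends on the history only through (s, y); this is the sense in which a Markov policy on the augmented space matches the best general policy on the original space.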
© 2002 Kluwer Academic Publishers
Cite this chapter
Iwamoto, S., Ueno, T., Fujita, T. (2002). Controlled Markov Chains with Utility Functions. In: Hou, Z., Filar, J.A., Chen, A. (eds) Markov Processes and Controlled Markov Chains. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0265-0_8
Print ISBN: 978-1-4613-7968-3
Online ISBN: 978-1-4613-0265-0