Abstract
In this paper we consider finite-stage stochastic optimization problems with a utility criterion, that is, the stochastic evaluation of an associatively accumulated reward through a utility function. We optimize the expected value of the utility criterion not in the class of Markov policies but in the class of general policies. We show that, by expanding the state space, an invariant imbedding approach yields a recursive relation between each pair of adjacent optimal value functions. We further show that the utility problem with a general policy is equivalent to a terminal problem with a Markov policy on the augmented state space. Finally, it is shown that the utility problem admits an optimal policy in the class of general policies on the original state space.
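The reduction described above can be illustrated with a small numerical sketch. The toy model below (states, actions, transition kernel, per-stage rewards, and the square-root utility are all illustrative assumptions, not taken from the paper) augments the state with the reward accumulated so far, so that backward recursion over the augmented state (s, y) solves the expected-utility problem as a terminal problem:

```python
import math
from functools import lru_cache

# Hypothetical toy instance (all data illustrative).
states = [0, 1]
actions = [0, 1]
T = 3  # finite horizon

def prob(s, a, s2):
    # Toy transition kernel p(s2 | s, a).
    return 0.7 if s2 == (s ^ a) else 0.3

def reward(s, a):
    # Per-stage reward; here the accumulation rule is ordinary addition,
    # a special case of an associative reward composition.
    return s + a

def utility(total):
    # Nonlinear utility applied to the terminally accumulated reward.
    return math.sqrt(total)

# Backward recursion on the augmented state (s, y), where y is the
# reward accumulated so far:
#   V_T(s, y) = utility(y)
#   V_t(s, y) = max_a sum_{s'} p(s' | s, a) * V_{t+1}(s', y + r(s, a))
@lru_cache(maxsize=None)
def V(t, s, y):
    if t == T:
        return utility(y)
    return max(
        sum(prob(s, a, s2) * V(t + 1, s2, y + reward(s, a)) for s2 in states)
        for a in actions
    )

# Optimal expected utility from state 0 with zero accumulated reward.
print(V(0, 0, 0))
```

Because the augmented state carries the accumulated reward, the maximizing action at each (t, s, y) depends on the history only through (s, y); this is the sense in which a Markov policy on the augmented space matches the best general policy on the original space.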
© 2002 Kluwer Academic Publishers
Cite this chapter
Iwamoto, S., Ueno, T., Fujita, T. (2002). Controlled Markov Chains with Utility Functions. In: Hou, Z., Filar, J.A., Chen, A. (eds) Markov Processes and Controlled Markov Chains. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0265-0_8
Print ISBN: 978-1-4613-7968-3
Online ISBN: 978-1-4613-0265-0