
Controlled Markov Chains with Utility Functions

Chapter in Markov Processes and Controlled Markov Chains

Abstract

In this paper we consider finite-stage stochastic optimization problems under a utility criterion, i.e., the stochastic evaluation of an associative reward through a utility function. We optimize the expected value of the utility criterion not in the class of Markov policies but in the class of general policies. We show that, by expanding the state space, an invariant imbedding approach yields a recursive relation between two adjacent optimal value functions. We further show that the utility problem with a general policy is equivalent to a terminal problem with a Markov policy on the augmented state space. Finally, it is shown that the utility problem admits an optimal policy in the class of general policies on the original state space.
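The state-expansion idea in the abstract can be illustrated with a small sketch: augment the original state with the reward accumulated so far, evaluate the utility only at the terminal stage, and run ordinary backward induction on the augmented state space. The chain, rewards, utility function, and horizon below are all hypothetical illustrations, not the paper's example; the reward is taken to be additive as the simplest associative case.

```python
import math
from functools import lru_cache

# Hypothetical 2-state, 2-action finite-horizon chain (all numbers illustrative).
N = 3                      # horizon (number of stages)
STATES = (0, 1)
ACTIONS = (0, 1)

def p(x, a, y):
    """Transition probability P(y | x, a); action a=1 makes staying more likely."""
    stay = 0.8 if a == 1 else 0.5
    return stay if y == x else 1.0 - stay

def r(x, a):
    """Integer stage reward accrued when action a is taken in state x."""
    return 1 + x + a

def U(s):
    """Concave utility evaluating the accumulated (additive) reward."""
    return math.sqrt(s)

@lru_cache(maxsize=None)
def V(t, x, s):
    """Optimal value at stage t with augmented state (x, s):
    x is the original state, s the reward accumulated so far.
    The utility enters only through the terminal value U(s),
    so on (x, s) this is a standard terminal-reward problem."""
    if t == N:
        return U(s)
    return max(
        sum(p(x, a, y) * V(t + 1, y, s + r(x, a)) for y in STATES)
        for a in ACTIONS
    )

# Optimal expected utility from state 0 with nothing yet accrued.
print(V(0, 0, 0))
```

On the augmented space the recursion between adjacent optimal value functions is Markov, even though an optimal policy for the original utility problem may need to consult the accumulated reward, i.e., it is a general (history-dependent) policy on the original state space.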




Copyright information

© 2002 Kluwer Academic Publishers

About this chapter


Iwamoto, S., Ueno, T., Fujita, T. (2002). Controlled Markov Chains with Utility Functions. In: Hou, Z., Filar, J.A., Chen, A. (eds) Markov Processes and Controlled Markov Chains. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0265-0_8


  • DOI: https://doi.org/10.1007/978-1-4613-0265-0_8

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7968-3

  • Online ISBN: 978-1-4613-0265-0

