Introduction

  • Eugene A. Feinberg
  • Adam Shwartz
Part of the International Series in Operations Research & Management Science (ISOR, volume 40)

Abstract

This volume deals with the theory of Markov Decision Processes (MDPs) and their applications. Each chapter was written by a leading expert in the respective area. The papers cover major research areas and methodologies, and discuss open questions and future research directions. The papers can be read independently, given the basic notation and concepts of Section 1.2. Most chapters should be accessible to graduate or advanced undergraduate students in the fields of operations research, electrical engineering, and computer science.
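
For orientation, here is a minimal sketch (our illustration, not a quotation from the volume) of the kind of basic notation Section 1.2 develops. In the standard discounted model with discount factor \beta \in [0,1), the optimality equation for the value function v is

  v(s) = \max_{a \in A(s)} \Big\{ r(s,a) + \beta \sum_{s' \in S} p(s' \mid s,a)\, v(s') \Big\}, \quad s \in S,

where S is the state set, A(s) the set of actions available in state s, r(s,a) the one-step reward, and p(s' \mid s,a) the transition probability. The average-reward criterion named in the keywords replaces the discounted sum by a long-run average and leads to a different optimality equation.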

Keywords

Markov Decision Process · Reward Function · Stochastic Game · Optimality Equation · Average Reward

Copyright information

© Springer Science+Business Media New York 2003

Authors and Affiliations

  • Eugene A. Feinberg (1)
  • Adam Shwartz (2)
  1. Department of Applied Mathematics and Statistics, SUNY at Stony Brook, Stony Brook, USA
  2. Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel
