The policy iteration method for the optimal stopping of a Markov chain with an application

  • K. M. van Hee
Optimal Design
Part of the Lecture Notes in Computer Science book series (LNCS, volume 41)


In this paper we study the optimal stopping of a Markov chain with a countable state space. In each state i the controller either stops the process and receives a reward r(i), or continues and pays a cost c(i). We show that, provided an optimal stopping rule exists, the policy iteration method introduced by Howard produces a sequence of stopping rules whose expected returns converge to the value function. For random walks on the integers with a special reward and cost structure, we show that the policy iteration method yields the solution of a discrete two-point boundary value problem with a free boundary. We give a simple algorithm for computing the optimal stopping rule.
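The policy iteration idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a finite state space (a symmetric random walk on {0, ..., 10} with absorbing end points) rather than a countable one, and the reward r(i) = 2|i - 5|, the cost c(i) = 1/4, and all function names are hypothetical choices made only for this example. Each round evaluates the current stopping set exactly, then moves a state into the stopping set whenever stopping is at least as good as paying c(i) and taking one more step.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact with Fractions).
    Assumes A is nonsingular."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(k for k in range(col, n) if M[k][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for k in range(n):
            if k != col and M[k][col] != 0:
                f = M[k][col]
                M[k] = [x - f * y for x, y in zip(M[k], M[col])]
    return [M[i][n] for i in range(n)]

def evaluate(P, r, c, stop):
    """Expected return of the rule 'stop on first entry into the set stop'.
    Solves v(i) = r(i) on stop and v(i) = -c(i) + sum_j P[i][j] v(j) elsewhere."""
    n = len(r)
    A = [[Fraction(0)] * n for _ in range(n)]
    b = [Fraction(0)] * n
    for i in range(n):
        A[i][i] = Fraction(1)
        if i in stop:
            b[i] = r[i]
        else:
            b[i] = -c[i]
            for j in range(n):
                A[i][j] -= P[i][j]
    return solve(A, b)

def policy_iteration(P, r, c):
    """Howard-style policy iteration, started from 'stop everywhere'."""
    n = len(r)
    stop = set(range(n))
    while True:
        v = evaluate(P, r, c, stop)
        # Improvement step: stop where stopping is at least as good as continuing.
        new_stop = {i for i in range(n)
                    if r[i] >= -c[i] + sum(P[i][j] * v[j] for j in range(n))}
        if new_stop == stop:
            return stop, v
        stop = new_stop

# Hypothetical example data: symmetric random walk on {0, ..., N}.
N = 10
half = Fraction(1, 2)
P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
P[0][0] = P[N][N] = Fraction(1)          # absorbing end points
for i in range(1, N):
    P[i][i - 1] = P[i][i + 1] = half
r = [Fraction(2 * abs(i - 5)) for i in range(N + 1)]   # reward for stopping
c = [Fraction(1, 4)] * (N + 1)                          # cost for continuing

stop, v = policy_iteration(P, r, c)
print(sorted(stop), [float(x) for x in v])
```

In this toy instance the continuation region is an interval around the kink of the reward, so the two end points of that interval play the role of a free boundary. Exact rational arithmetic is used so that the comparison in the improvement step is not affected by rounding.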


  1. DYNKIN, E.B., JUSCHKEWITSCH, A.A.; Sätze und Aufgaben über Markoffsche Prozesse. Springer-Verlag (1969).
  2. HORDIJK, A., POTHARST, R., RUNNENBURG, J. Th.; Optimaal stoppen van Markov ketens. MC-syllabus 19 (1973).
  3. HORDIJK, A.; Dynamic programming and Markov potential theory. MC tract (1974).
  4. HOWARD, R.A.; Dynamic programming and Markov processes. Technology Press, Cambridge, Massachusetts (1960).
  5. VAN HEE, K.M., HORDIJK, A.; A sequential sampling problem solved by optimal stopping. MC-rapport SW 25/73 (1973).
  6. VAN HEE, K.M.; Note on memoryless stopping rules. COSOR-notitie R-73-12, T.H. Eindhoven (1974).
  7. ROSS, S.; Applied probability models with optimization applications. Holden-Day (1970).

Copyright information

© Springer-Verlag Berlin Heidelberg 1976
