Abstract
In this paper we consider a finite optimal stopping problem with unknown stationary transition probabilities; the payoffs are assumed to be known. We first estimate the value of stationary deterministic decision rules, and from these estimates we obtain estimators of an optimal decision rule and of the optimal value of the problem that are consistent with probability one. Two methods are studied: the maximum likelihood estimator and a new procedure, which we call the stretch estimator and which proves to be the more efficient technique.
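The abstract's first method rests on a standard fact: for a Markov chain with stationary transition probabilities, the maximum likelihood estimate of the transition matrix from one observed trajectory is the matrix of empirical transition frequencies. The following is a minimal sketch of that empirical-frequency estimator, not the paper's full procedure; the function name and the example trajectory are illustrative choices, and how unvisited states are handled (here, a uniform row) is an assumption.

```python
import numpy as np

def mle_transition_matrix(path, n_states):
    """Empirical-frequency (maximum likelihood) estimate of a stationary
    transition matrix from a single observed trajectory.

    path: sequence of visited states, each in {0, ..., n_states - 1}.
    """
    counts = np.zeros((n_states, n_states))
    # Count one-step transitions i -> j along the trajectory.
    for i, j in zip(path[:-1], path[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states never left are set uniform (an arbitrary convention)
    # so the returned matrix is still stochastic.
    return np.where(row_sums > 0,
                    counts / np.maximum(row_sums, 1),
                    1.0 / n_states)

# Example: a two-state chain observed for seven steps.
path = [0, 0, 1, 0, 1, 1, 0]
P_hat = mle_transition_matrix(path, 2)
```

On the example trajectory the observed transitions from state 0 are 0→0 once and 0→1 twice, so the first row of `P_hat` is (1/3, 2/3); each row sums to one by construction.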
Rumeau, T.P. Statistical inference for a finite optimal stopping problem with unknown transition probabilities. Test 12, 215–239 (2003). https://doi.org/10.1007/BF02595820