Abstract
A bandit problem with infinitely many Bernoulli arms is considered. The success parameters of the arms are independent and identically distributed random variables drawn from a common beta(a, b) distribution. We investigate the k-failure strategy, which is a modification of Robbins's stay-with-a-winner/switch-on-a-loser strategy, together with three other strategies proposed by Berry et al. (1997, Ann. Statist., 25, 2103–2116). We show that the k-failure strategy performs poorly when b is greater than 1, and that the 1-failure strategy is the best of the k-failure strategies when b is less than or equal to 1. Using the formulas derived by Berry et al. (1997), we obtain the asymptotic expected failure rates of these three strategies for beta prior distributions. Numerical estimates and simulations for a variety of beta prior distributions are presented to illustrate the performance of these strategies.
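To make the setting concrete, the following is a minimal simulation sketch of a k-failure strategy, not the authors' code. It assumes the usual reading of the strategy: keep pulling the current arm until it has produced k failures, then discard it and draw a fresh arm from the beta(a, b) prior. The function name `k_failure_rate`, its parameters, and the seed handling are hypothetical choices made for illustration.

```python
import random


def k_failure_rate(a, b, k, n, seed=0):
    """Simulate n pulls under a k-failure strategy (assumed definition:
    stay on an arm until it has failed k times, then switch to a new arm).

    Returns the observed proportion of failures over the n pulls.
    """
    rng = random.Random(seed)
    theta = rng.betavariate(a, b)  # success probability of the current arm
    arm_failures = 0               # failures seen on the current arm
    failures = 0                   # failures seen overall
    for _ in range(n):
        if rng.random() < theta:
            continue               # success: stay on the same arm
        failures += 1
        arm_failures += 1
        if arm_failures == k:
            # The arm has used up its k failures: draw a new arm from the prior.
            theta = rng.betavariate(a, b)
            arm_failures = 0
    return failures / n


if __name__ == "__main__":
    # Compare a few k values under two illustrative priors.
    for a, b in [(1.0, 1.0), (1.0, 3.0)]:
        for k in (1, 2, 5):
            rate = k_failure_rate(a, b, k, n=100_000, seed=42)
            print(f"beta({a}, {b}), k={k}: failure rate ~ {rate:.4f}")
```

Runs of this kind only illustrate finite-horizon behaviour for the chosen priors; the asymptotic rates discussed in the abstract come from the paper's analysis, not from this sketch.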
References
Banks, J. S. and Sundaram, R. K. (1992). Denumerable-armed bandits, Econometrica, 60, 1071–1096.
Berry, D. A. and Fristedt, B. (1985). Bandit Problems: Sequential Allocations of Experiments, Chapman and Hall, London.
Berry, D. A., Chen, R. W., Zame, A., Heath, D. C. and Shepp, L. A. (1997). Bandit problems with infinitely many arms, Ann. Statist., 25, 2103–2116.
Gittins, J. C. (1989). Multi-armed Bandit Allocation Indices, Wiley, New York.
Herschkorn, S. J., Pekoz, E. and Ross, S. M. (1995). Policies without memory for the infinite-armed Bernoulli bandit under the average-reward criterion, Probab. Engrg. Inform. Sci., 10, 21–28.
Robbins, H. (1952). Some aspects of the sequential design of experiments, Bull. Amer. Math. Soc., 58, 527–536.
Cite this article
Lin, CT., Shiau, C.J. Some Optimal Strategies for Bandit Problems with Beta Prior Distributions. Annals of the Institute of Statistical Mathematics 52, 397–405 (2000). https://doi.org/10.1023/A:1004130209258