Abstract
Variations of the multiarmed bandit problem are introduced, and a sequence of results leading to the work of Lai and Robbins and its extensions is summarized. The guiding concern is to determine the optimal tradeoff between taking actions that maximize immediate rewards, given current information about unknown system parameters, and conducting experiments that may reduce immediate rewards but improve the parameter estimates.
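For orientation, the tradeoff described above is commonly quantified by the logarithmic lower bound on regret from Lai and Robbins (1985). The statement below is a sketch in notation chosen here for illustration, not necessarily the notation of the paper: arms $j = 1, \dots, m$ yield rewards with density $f(x;\theta_j)$ and mean $\mu(\theta_j)$, $\theta^*$ is the parameter of an arm with the largest mean $\mu^*$, $T_n(j)$ is the number of times arm $j$ is played in the first $n$ trials, and $I(\theta, \lambda)$ is the Kullback-Leibler number.

```latex
% Regret after n plays (notation assumed for illustration):
R_n \;=\; \sum_{j:\,\mu(\theta_j) < \mu^*} \bigl(\mu^* - \mu(\theta_j)\bigr)\, \mathbb{E}\,T_n(j)

% Kullback-Leibler number between two arm distributions:
I(\theta, \lambda) \;=\; \int f(x;\theta)\, \log\frac{f(x;\theta)}{f(x;\lambda)}\, dx

% Lower bound for any rule whose regret is o(n^a) for every a > 0
% under all parameter configurations ("uniformly good" rules):
\liminf_{n \to \infty} \frac{R_n}{\log n}
  \;\ge\; \sum_{j:\,\mu(\theta_j) < \mu^*} \frac{\mu^* - \mu(\theta_j)}{I(\theta_j, \theta^*)}
```

Read this way, an asymptotically efficient rule is one that attains the bound: it samples each inferior arm roughly $\log n / I(\theta_j, \theta^*)$ times, which is just enough experimentation to keep the parameter estimates reliable.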
References
Anantharam, V., Ph.D. Dissertation, Univ. of California, Berkeley, 1986.
Gittins, J.C., "Bandit processes and dynamic allocation indices," J. Roy. Statist. Soc., vol. 41, 1979, 148–177.
Gittins, J.C. and D.M. Jones, "A dynamic allocation index for the sequential design of experiments," in Gani, J., K. Sarkadi and I. Vince, Eds., Progress in Statistics, Euro. Meet. Statist., vol. 1, New York, North-Holland, 1972, 241–266.
Lai, T.L., "Some thoughts on stochastic adaptive control," Proc. 23rd IEEE Conf. on Decision and Control, Las Vegas, Dec. 1984, 51–56.
Lai, T.L. and H. Robbins, "Asymptotically efficient adaptive allocation rules," Adv. Appl. Math., vol. 6, 1985, 4–22.
Lai, T.L. and H. Robbins, "Asymptotically efficient allocation of treatments in sequential experiments," in Santner, T.J. and A.C. Tamhane, Eds., Design of Experiments, New York, Marcel Dekker, 1985, 127–142.
Varaiya, P., J.C. Walrand and C. Buyukkoc, "Extensions of the multiarmed bandit problem," IEEE Trans. Automat. Contr., vol. AC-30, May 1985, 426–439.
Weitzman, M.L., "Optimal search for the best alternative," Econometrica, vol. 47, 1979, 641–654.
Whittle, P., "Multi-armed bandits and the Gittins index," J. Roy. Statist. Soc., vol. 42, 1980, 143–149.
Copyright information
© 1988 Springer-Verlag
About this paper
Cite this paper
Anantharam, V., Varaiya, P. (1988). Asymptotically efficient rules in multiarmed bandit problems. In: Byrnes, C.I., Kurzhanski, A.B. (eds) Modelling and Adaptive Control. Lecture Notes in Control and Information Sciences, vol 105. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0043173
DOI: https://doi.org/10.1007/BFb0043173
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-19019-6
Online ISBN: 978-3-540-38904-0
eBook Packages: Springer Book Archive