Summary
This chapter examines a number of extensions of the multi-armed bandit framework. We consider the possibility of an infinite number of available arms, we give conditions under which the Gittins index strategy is well-defined, and we examine the optimality of that strategy. We then consider some difficulties arising from “parallel search,” in which a decision-maker may pull more than one arm per period, and from the introduction of a cost of switching between arms.
The questions addressed in this paper grew out of my work with Jeff Banks on bandit problems and their applications (Banks and Sundaram [3, 4, 5]) and owe much to many discussions I had with him on this subject. I also had the benefit of several discussions with Andy McLennan, especially regarding the material in Sections 4 and 6 of this paper.
References
Agrawal, R., M. V. Hegde, and D. Teneketzis (1988) Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching costs. IEEE Transactions on Automatic Control, 33(10): 899–906.
Berry, D. and B. Fristedt (1985) Bandit Problems: Sequential Allocation of Experiments. London: Chapman and Hall.
Banks, J. S. and R. K. Sundaram (1992a) Denumerable-armed bandits. Econometrica, 60(5): 1071–1096.
Banks, J. S. and R. K. Sundaram (1992b) A class of bandit problems yielding myopic optimal strategies. Journal of Applied Probability, 625–632.
Banks, J. S. and R. K. Sundaram (1994) Switching costs and the Gittins index. Econometrica, 62(3): 687–694.
Basu, A., A. Bose, and J. K. Ghosh (1990) An expository review of sequential design and allocation rules. Mimeo. Purdue University.
Blackwell, D. (1965) Discounted dynamic programming. Annals of Mathematical Statistics, 36: 225–235.
Feldman, D. (1962) Contributions to the “two-armed bandit” problem. Annals of Mathematical Statistics, 33: 847–856.
Feldman, M. and M. Spagat (1993) Optimal learning with costly adjustment. Mimeo. Brown University.
Gittins, J. (1979) Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society, Series B, 41: 148–164.
Gittins, J. (1989) Allocation Indices for Multi-Armed Bandits. London: Wiley.
Gittins, J. and D. Jones (1974) A dynamic allocation index for the sequential allocation of experiments. In J. Gani et al. (eds) Progress in Statistics. Amsterdam: North Holland.
Kolonko, M. and H. Benzing (1983) The sequential design of Bernoulli experiments including switching costs. Mimeo.
Lai, T. L. and H. Robbins (1985) Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6: 4–22.
Mortensen, D. (1985) Job search and labor market analysis. In O. Ashenfelter and J. Layard (eds) Handbook of Labor Economics, Vol. II. New York: North Holland.
Pressman, E. L. and I. M. Sonin (1990) Sequential Control with Partial Information. New York: Academic Press.
Rieder, U. (1975) Bayesian dynamic programming. Advances in Applied Probability, 7: 330–348.
Rothschild, M. (1974) A two-armed bandit theory of market pricing. Journal of Economic Theory, 9: 185–202.
Schäl, M. (1979) Dynamic programming and statistical decision theory. Annals of Statistics, 7(2): 432–445.
Viscusi, W. (1979) Job hazards and worker quit rates: an analysis of adaptive worker behavior. International Economic Review, 20: 29–58.
Weitzman, M. L. (1979) Optimal search for the best alternative. Econometrica, 47: 641–654.
Whittle, P. (1981) Arm-acquiring bandits. Annals of Probability, 9(2): 284–292.
Whittle, P. (1982) Optimization Over Time: Dynamic Programming and Stochastic Control, Vol. I. New York: Wiley.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Sundaram, R.K. (2005). Generalized Bandit Problems. In: Austen-Smith, D., Duggan, J. (eds) Social Choice and Strategic Decisions. Studies in Choice and Welfare. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27295-X_6
DOI: https://doi.org/10.1007/3-540-27295-X_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22053-4
Online ISBN: 978-3-540-27295-3
eBook Packages: Business and Economics, Economics and Finance (R0)