Generalized Bandit Problems

Sundaram, Rangarajan K.

doi:10.1007/3-540-27295-X_6

Part of the book series: Studies in Choice and Welfare (WELFARE)

Summary

This chapter examines a number of extensions of the multi-armed bandit framework. We consider the possibility of an infinite number of available arms, we give conditions under which the Gittins index strategy is well-defined, and we examine the optimality of that strategy. We then consider some difficulties arising from “parallel search,” in which a decision-maker may pull more than one arm per period, and from the introduction of a cost of switching between arms.

The questions addressed in this paper grew out of my work with Jeff Banks on bandit problems and their applications (Banks and Sundaram [3, 4, 5]) and owe much to the many discussions I had with him on this subject. I also benefited from several discussions with Andy McLennan, especially regarding the material in Sections 4 and 6 of this paper.

References

1. Agrawal, R., M. V. Hegde, and D. Teneketzis (1988) Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching costs. IEEE Transactions on Automatic Control, 33(10): 899–906.
2. Berry, D. and B. Fristedt (1985) Bandit Problems: Sequential Allocation of Experiments. London: Chapman and Hall.
3. Banks, J. S. and R. K. Sundaram (1992a) Denumerable-armed bandits. Econometrica, 60(5): 1071–1096.
4. Banks, J. S. and R. K. Sundaram (1992b) A class of bandit problems yielding myopic optimal strategies. Journal of Applied Probability, 625–632.
5. Banks, J. S. and R. K. Sundaram (1994) Switching costs and the Gittins index. Econometrica, 62(3): 687–694.
6. Basu, A., A. Bose, and J. K. Ghosh (1990) An expository review of sequential design and allocation rules. Mimeo. Purdue University.
7. Blackwell, D. (1965) Discounted dynamic programming. Annals of Mathematical Statistics, 36: 225–235.
8. Feldman, D. (1962) Contributions to the “two-armed bandit” problem. Annals of Mathematical Statistics, 33: 847–856.
9. Feldman, M. and M. Spagat (1993) Optimal learning with costly adjustment. Mimeo. Brown University.
10. Gittins, J. (1979) Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society, Series B, 41: 148–164.
11. Gittins, J. (1989) Allocation Indices for Multi-Armed Bandits. London: Wiley.
12. Gittins, J. and D. Jones (1974) A dynamic allocation index for the sequential allocation of experiments. In J. Gani et al. (eds) Progress in Statistics. Amsterdam: North Holland.
13. Kolonko, M. and H. Benzing (1983) The sequential design of Bernoulli experiments including switching costs. Mimeo.
14. Lai, T. L. and H. Robbins (1985) Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6: 4–22.
15. Mortensen, D. (1985) Job search and labor market analysis. In O. Ashenfelter and J. Layard (eds) Handbook of Labor Economics, Vol. II. New York: North Holland.
16. Pressman, E. L. and I. M. Sonin (1990) Sequential Control with Partial Information. New York: Academic Press.
17. Rieder, U. (1975) Bayesian dynamic programming. Advances in Applied Probability, 7: 330–348.
18. Rothschild, M. (1974) A two-armed bandit theory of market pricing. Journal of Economic Theory, 9: 185–202.
19. Schäl, M. (1979) Dynamic programming and statistical decision theory. Annals of Statistics, 7(2): 432–445.
20. Viscusi, W. (1979) Job hazards and worker quit rates: an analysis of adaptive worker behavior. International Economic Review, 20: 29–58.
21. Weitzman, M. L. (1979) Optimal search for the best alternative. Econometrica, 47: 641–654.
22. Whittle, P. (1981) Arm-acquiring bandits. Annals of Probability, 9(2): 284–292.
23. Whittle, P. (1982) Optimization Over Time: Dynamic Programming and Stochastic Control, Vol. I. New York: Wiley.

Author information

Authors and Affiliations

New York University, USA
Rangarajan K. Sundaram

Editor information

Editors and Affiliations

Department of Managerial Economics & Decision Sciences, Kellogg School of Management, Northwestern University, 2001 Sheridan Road, Evanston, IL, 60208-2009, USA
David Austen-Smith (Earl Dean Howard Distinguished Professor of Political Economy)
W. Allen Wallis Institute of Political Economy, University of Rochester, Rochester, NY, 14627, USA
John Duggan (Associate Professor of Political Science and Economics)

About this chapter

Cite this chapter

Sundaram, R.K. (2005). Generalized Bandit Problems. In: Austen-Smith, D., Duggan, J. (eds) Social Choice and Strategic Decisions. Studies in Choice and Welfare. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27295-X_6

DOI: https://doi.org/10.1007/3-540-27295-X_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22053-4
Online ISBN: 978-3-540-27295-3
eBook Packages: Business and Economics, Economics and Finance (R0)
