Theory and Decision, Volume 47, Issue 3, pp 267–295

Dynamic stochastic dominance in bandit decision problems

  • Thierry Magnac
  • Jean-Marc Robin
Article

Abstract

This paper studies the monotonicity properties of optimal decisions in bandit decision problems with respect to the probability distribution of the state processes. Orderings of dynamic discrete projects are obtained by extending the notion of stochastic dominance to stochastic processes.

Stochastic dynamic programming; Multi-armed bandit problems; Stochastic dominance
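
For reference, the static notion that the abstract extends can be illustrated with first-order stochastic dominance: one distribution dominates another when its cumulative distribution function lies at or below the other's everywhere. The sketch below checks this for two discrete distributions on a common ordered support. It is illustrative only. The function name `first_order_dominates` and the two arm distributions are hypothetical, and the code does not implement the paper's dynamic ordering of stochastic processes.

```python
from itertools import accumulate


def first_order_dominates(p, q, tol=1e-12):
    """Return True if p first-order stochastically dominates q.

    p and q are lists of probabilities over the same support,
    listed in increasing order of outcome.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same support")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    # p dominates q when the CDF of p never exceeds the CDF of q.
    return all(fp <= fq + tol for fp, fq in zip(cdf_p, cdf_q))


# Hypothetical payoff distributions of two arms over outcomes 0, 1, 2.
arm_a = [0.2, 0.3, 0.5]
arm_b = [0.4, 0.3, 0.3]

print(first_order_dominates(arm_a, arm_b))
print(first_order_dominates(arm_b, arm_a))
```

Here `arm_a` places more weight on high outcomes, so its cumulative distribution lies below that of `arm_b` and the first check succeeds while the reverse check fails.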

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Thierry Magnac (1)
  • Jean-Marc Robin (1)
  1. INRA-LEA, École Normale Supérieure, Paris, France
