On the two-armed bandit problem with non-observed Poissonian switching of arms

Donchev, Doncho S.

doi:10.1007/BF01198403

Published: October 1998

Mathematical Methods of Operations Research, Volume 47, pages 401–422 (1998)

Abstract

We consider the symmetric case of the Poisson variant of the two-armed bandit problem. We suppose that the bandit arms switch sequentially at the jump times of an unobserved Poisson process. The myopic policy is proved to be optimal for all sufficiently large switching rates. For this case, explicit formulas that closely approximate the value function are also obtained.
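
To make the setting concrete, the following is a minimal simulation sketch of a myopic policy in a bandit whose arms swap at unobserved Poisson times. It is not taken from the article: the discrete time step `dt`, the event rates `a` and `b`, the switching rate `lam`, and the function names are all illustrative assumptions, and the belief update is a plain Bayes filter on a time grid rather than the continuous-time analysis the paper carries out.

```python
import random


def update_belief(p, arm, event, a, b, lam, dt):
    """One filtering step for p = P(arm 0 is currently the good arm).

    Assumed model (not from the article): the good arm produces reward
    events at rate a, the other at rate b, and the roles swap at rate lam.
    """
    # Event rate of the pulled arm under each hypothesis about the hidden state.
    rate_if_0_good = a if arm == 0 else b
    rate_if_1_good = b if arm == 0 else a
    # Likelihood of the observation over a step of length dt.
    like0 = rate_if_0_good * dt if event else 1.0 - rate_if_0_good * dt
    like1 = rate_if_1_good * dt if event else 1.0 - rate_if_1_good * dt
    # Bayes correction, then prediction through a possible unobserved swap.
    p = p * like0 / (p * like0 + (1.0 - p) * like1)
    return p * (1.0 - lam * dt) + (1.0 - p) * lam * dt


def simulate_myopic(a=2.0, b=0.5, lam=1.0, horizon=50.0, dt=0.01, seed=0):
    """Simulate the myopic rule: always pull the arm currently believed better."""
    rng = random.Random(seed)
    good, p, events = 0, 0.5, 0
    for _ in range(int(horizon / dt)):
        arm = 0 if p >= 0.5 else 1           # myopic choice from the belief
        rate = a if arm == good else b       # true rate of the pulled arm
        event = rng.random() < rate * dt     # did a reward event occur?
        events += event
        p = update_belief(p, arm, event, a, b, lam, dt)
        if rng.random() < lam * dt:          # hidden swap of the arms
            good = 1 - good
    return events / horizon, p


if __name__ == "__main__":
    reward_rate, belief = simulate_myopic()
    print(f"average reward rate: {reward_rate:.3f}, final belief: {belief:.3f}")
```

Under these assumptions an observed event raises the belief in the pulled arm and a quiet step lowers it slightly, while the prediction step pulls the belief back toward 1/2 at a speed governed by `lam`. The sketch only illustrates what "myopic" means here; it says nothing about optimality, which is the subject of the paper's proof.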

Author information

Authors and Affiliations

Department of Mathematics, Higher Institute of Food and Flavor Industries, 26, Maritza str., 4002, Plovdiv, Bulgaria
Doncho S. Donchev

About this article

Cite this article

Donchev, D.S. On the two-armed bandit problem with non-observed Poissonian switching of arms. Mathematical Methods of Operations Research 47, 401–422 (1998). https://doi.org/10.1007/BF01198403

Revised: 15 December 1996
Issue Date: October 1998
DOI: https://doi.org/10.1007/BF01198403
