One-armed bandit problem for parallel data processing systems

Kolnogorov, A. V.

doi:10.1134/S0032946015020088

Large Systems
Published: 07 July 2015

Volume 51, pages 177–191, (2015)

Problems of Information Transmission

Abstract

We consider the minimax setting for the one-armed bandit problem, i.e., the two-armed bandit problem in which the distribution function of incomes corresponding to the first action is known. Incomes that correspond to the second action are normally distributed with unit variance and an unknown mathematical expectation. According to the main theorem of game theory, the minimax strategy and minimax risk are sought as the Bayesian strategy and risk corresponding to the worst-case prior distribution. The results can be applied to parallel data processing systems in which two processing methods are available and the efficiency of the first is known a priori.
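
The setting in the abstract can be illustrated with a small simulation. The sketch below is not the strategy derived in the article: it assumes a simple, hypothetical threshold rule (use the second action until its cumulative centred income falls below a negative threshold, then use the first action for the rest of the horizon) and estimates the worst-case expected loss over a grid of candidate expectations. The function names, the threshold rule, the grid, and the scaling of the threshold by the square root of the horizon are all assumptions made for illustration.

```python
import math
import random


def pseudo_regret(m, horizon, threshold, known_mean=0.0, rng=None):
    """Loss of one run of a hypothetical threshold rule.

    The second (unknown) action yields N(m, 1) incomes. It is used until
    its cumulative centred income drops below -threshold; after that the
    first (known) action is used until the horizon ends.
    """
    if rng is None:
        rng = random.Random()
    total = 0.0          # cumulative (income - known_mean) of the unknown action
    pulls_unknown = 0
    for _ in range(horizon):
        if total < -threshold:
            break        # switch to the known action for good
        total += rng.gauss(m, 1.0) - known_mean
        pulls_unknown += 1
    gap = m - known_mean
    if gap >= 0:
        # Unknown action is at least as good: each use of the known one loses `gap`.
        return gap * (horizon - pulls_unknown)
    # Unknown action is worse: each use of it loses |gap|.
    return -gap * pulls_unknown


def worst_case_risk(horizon, threshold, means, runs=100, seed=0):
    """Monte Carlo estimate of the largest expected loss over `means`."""
    rng = random.Random(seed)
    risks = {}
    for m in means:
        losses = (pseudo_regret(m, horizon, threshold, rng=rng) for _ in range(runs))
        risks[m] = sum(losses) / runs
    worst = max(risks, key=risks.get)
    return worst, risks[worst]


if __name__ == "__main__":
    horizon = 200
    # Candidate expectations around the known mean 0 (an arbitrary grid).
    grid = [d / math.sqrt(horizon) for d in (-4, -3, -2, -1, 0, 1, 2, 3, 4)]
    for c in (0.5, 1.0, 2.0):
        threshold = c * math.sqrt(horizon)
        worst_m, risk = worst_case_risk(horizon, threshold, grid)
        print(f"threshold={threshold:.2f}  worst m={worst_m:+.3f}  risk={risk:.2f}")
```

Comparing the printed worst-case risks across thresholds gives a rough feel for the minimax idea: a strategy is judged by its loss at the least favourable expectation, not at a typical one.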

Author information

Authors and Affiliations

Applied Mathematics and Information Science Department, Yaroslav-the-Wise Novgorod State University, Veliky Novgorod, Russia
A. V. Kolnogorov

Corresponding author

Correspondence to A. V. Kolnogorov.

Additional information

Original Russian Text © A.V. Kolnogorov, 2015, published in Problemy Peredachi Informatsii, 2015, Vol. 51, No. 2, pp. 99–113.

Supported in part by the Russian Foundation for Basic Research, project no. 13-01-00334-a, and the Project Part of the State Assignment in the Field of Scientific Activity by the Ministry of Education and Science of the Russian Federation, project no. 1.949.2014/K.

About this article

Cite this article

Kolnogorov, A.V. One-armed bandit problem for parallel data processing systems. Probl Inf Transm 51, 177–191 (2015). https://doi.org/10.1134/S0032946015020088

Received: 02 September 2014
Accepted: 25 February 2015
Published: 07 July 2015
Issue Date: April 2015
DOI: https://doi.org/10.1134/S0032946015020088
