We consider the quantum version of the bandit problem known as best arm identification (BAI). We first propose a quantum model of the BAI problem in which both the learning agent and the environment are quantum; we then propose an algorithm based on quantum amplitude amplification to solve BAI. We formally analyze the behavior of the algorithm on all instances of the problem and show, in particular, that it reaches the optimal solution quadratically faster than what is known to hold in the classical case.
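To give a feel for the quadratic speed-up, the sketch below evaluates the standard amplitude amplification formula (Brassard et al. 2000): if a "good" outcome is observed with initial probability p = sin²(θ), then after k amplification rounds it is observed with probability sin²((2k+1)θ). This is not the algorithm analyzed in the article; the function names and the value of p are illustrative assumptions, and the classical baseline is simply the expected number of repeated independent draws, 1/p.

```python
import math


def success_probability(p: float, k: int) -> float:
    """Probability of a good outcome after k amplification rounds.

    Standard amplitude amplification result: with p = sin^2(theta),
    the success probability becomes sin^2((2k + 1) * theta).
    """
    theta = math.asin(math.sqrt(p))
    return math.sin((2 * k + 1) * theta) ** 2


def optimal_rounds(p: float) -> int:
    """Number of rounds that brings (2k + 1) * theta close to pi / 2."""
    theta = math.asin(math.sqrt(p))
    return math.floor(math.pi / (4 * theta))


# Hypothetical instance: a good outcome has initial probability 1/1024.
p = 1 / 1024
k = optimal_rounds(p)

# Naive classical baseline (assumption): repeat independent draws until
# a good outcome shows up, which takes 1/p draws in expectation.
classical_draws = 1 / p

print(f"quantum rounds: {k}")
print(f"success probability after {k} rounds: {success_probability(p, k):.4f}")
print(f"expected classical draws: {classical_draws:.0f}")
```

With these illustrative numbers the round count scales like 1/√p, whereas the classical baseline scales like 1/p.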
This work has been funded by the French National Research Agency (ANR) project QuantML (grant number ANR-19-CE23-0011), the Pépinière d’Excellence 2018, AMIDEX foundation, project DiTiQuS, and the ID #60609 grant from the John Templeton Foundation, as part of “The Quantum Information Structure of Spacetime (QISS)” Project.
Cite this article
Casalé, B., Di Molfetta, G., Kadri, H. et al. Quantum bandits. Quantum Mach. Intell. 2, 11 (2020). https://doi.org/10.1007/s42484-020-00024-8
- Best arm identification
- Quantum amplitude amplification