
Quantum bandits


We consider the quantum version of the bandit problem known as best arm identification (BAI). We first propose a quantum model of the BAI problem, which assumes that both the learning agent and the environment are quantum; we then propose an algorithm based on quantum amplitude amplification to solve BAI. We formally analyze the algorithm's behavior on all instances of the problem and show, in particular, that it finds the optimal solution quadratically faster than what is known to be achievable in the classical case.
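To give a feel for where a quadratic speedup of this kind comes from, the following is a minimal classical simulation of textbook amplitude amplification (Grover-style search), not the algorithm proposed in the article. It assumes a uniform superposition over a hypothetical set of `n_arms` indices with a single marked index `best_arm`; the function names and the choice of 64 arms are illustrative assumptions only.

```python
import math


def amplify(n_arms, best_arm, iterations):
    """Simulate amplitude amplification on a real state vector.

    Assumption: a uniform initial superposition and exactly one marked
    index. Returns the probability of measuring `best_arm` afterwards.
    """
    amp = [1.0 / math.sqrt(n_arms)] * n_arms
    for _ in range(iterations):
        # Oracle step: flip the sign of the marked amplitude.
        amp[best_arm] = -amp[best_arm]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amp) / n_arms
        amp = [2.0 * mean - a for a in amp]
    return amp[best_arm] ** 2


def optimal_iterations(n_arms):
    """Iteration count that brings the marked amplitude closest to 1."""
    theta = math.asin(1.0 / math.sqrt(n_arms))
    return int(math.floor(math.pi / (4.0 * theta)))


n_arms = 64  # hypothetical number of arms
k = optimal_iterations(n_arms)
p_success = amplify(n_arms, best_arm=17, iterations=k)
theta = math.asin(1.0 / math.sqrt(n_arms))
# Closed form for comparison: sin^2((2k + 1) * theta).
p_closed_form = math.sin((2 * k + 1) * theta) ** 2
```

In this toy setting the number of iterations grows like the square root of `n_arms`, whereas sampling indices uniformly at random would need on the order of `n_arms` draws to hit the marked one. The article's setting, with stochastic rewards and a quantum environment, is more general than this sketch.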





This work has been funded by the French National Research Agency (ANR) project QuantML (grant number ANR-19-CE23-0011), the Pépinière d’Excellence 2018, AMIDEX foundation, project DiTiQuS, and the ID #60609 grant from the John Templeton Foundation, as part of “The Quantum Information Structure of Spacetime (QISS)” project.

Author information



Corresponding author

Correspondence to Balthazar Casalé.


About this article


Cite this article

Casalé, B., Di Molfetta, G., Kadri, H. et al. Quantum bandits. Quantum Mach. Intell. 2, 11 (2020).



  • Bandits
  • Best arm identification
  • Quantum amplitude amplification