Abstract
Federated learning refers to machine learning on decentralized data from multiple clients while preserving data privacy. Recent studies show that quantum algorithms can be exploited to boost its performance. However, when the clients’ data are not independent and identically distributed (IID), the performance of conventional federated algorithms is known to deteriorate. In this work, we explore the non-IID issue in quantum federated learning with both theoretical and numerical analysis. We further prove that a global quantum channel can be exactly decomposed into local channels trained by each client with the help of local density estimators. This observation leads to a general framework for quantum federated learning on non-IID data with one-shot communication complexity. Numerical simulations show that the proposed algorithm significantly outperforms conventional ones under non-IID settings.
Data availability
All the data and materials used in this work can be accessed at https://github.com/JasonZHM/quantum-fed-infer.
Notes
In a typical classification problem, k indexes the different classes and the standard cross-entropy loss is given by \(f_{k}(w, |\psi \rangle ) = -\log r_{k} (w, |\psi \rangle )\), where rk is the predicted probability of |ψ〉 belonging to class k.
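As a purely classical illustration of this loss (not code from the paper; the probabilities here are hypothetical), the per-sample cross-entropy is simply the negative log of the probability assigned to the true class:

```python
import numpy as np

def cross_entropy(probs, k):
    """Per-sample cross-entropy: negative log of the predicted
    probability r_k assigned to the true class k."""
    return -np.log(probs[k])

# Hypothetical predicted class probabilities r_k for one sample.
r = np.array([0.7, 0.2, 0.1])
loss_correct = cross_entropy(r, 0)  # confident prediction, small loss
loss_wrong = cross_entropy(r, 2)    # low-probability class, large loss
assert loss_correct < loss_wrong
```

A confident correct prediction thus incurs a loss near zero, while assigning low probability to the true class is penalized heavily.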
The fact that identical samples have identical labels guarantees that the global channel \({\mathscr{M}}\) is well defined. For example, suppose we have two identical samples \(|\psi ^{C_{i}}\rangle =|\psi ^{C_{j}}\rangle =|\psi \rangle \) with the same label y, but are from different clients Ci≠Cj. Then, in order to fulfill the local minimization problems, Eq. (6), we must have \({\mathscr{M}}_{i}(|\psi \rangle \langle \psi |)={\mathscr{M}}_{j}(|\psi \rangle \langle \psi |)\).
To avoid confusion, we deliberately use different notations for \(P_{x}^{\psi } = \sigma _{x}^{\psi } = |\psi \rangle \langle \psi |\) to emphasize their different physical meanings: \(\sigma _{x}^{\psi }\) denotes the quantum state loaded into the circuit, while \(P_{x}^{\psi }\) denotes the projection operator.
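Numerically the two objects coincide; a minimal numpy sketch (with a hypothetical single-qubit state) illustrates that the distinction is one of interpretation, not of the matrix itself:

```python
import numpy as np

# A normalized pure state |psi> on one qubit (hypothetical example).
psi = np.array([1.0, 1.0]) / np.sqrt(2)

# The same matrix |psi><psi| plays two roles:
sigma = np.outer(psi, psi.conj())  # as a density matrix (a loadable state)
P = np.outer(psi, psi.conj())      # as a rank-1 projection operator

# Density-matrix property: unit trace.
assert np.isclose(np.trace(sigma), 1.0)

# Projector property: idempotence, P @ P = P.
assert np.allclose(P @ P, P)
```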
References
Barenco A, Bennett CH, Cleve R, DiVincenzo DP, Margolus N, Shor P, Sleator T, Smolin JA, Weinfurter H (1995) Elementary gates for quantum computation. Phys Review A 52:3457
Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N, Lloyd S (2017) Quantum machine learning. Nature 549:195
Bishop CM (2006) Pattern recognition and machine learning (information science and statistics). Springer-Verlag, Berlin
Bradbury J, Frostig R, Hawkins P, Johnson MJ, Leary C, Maclaurin D, Necula G, Paszke A, VanderPlas J, Wanderman-Milne S, Zhang Q (2018) JAX: composable transformations of Python+NumPy programs
Broadbent A, Fitzsimons J, Kashefi E (2009) Universal blind quantum computation. In: 2009 50th annual IEEE symposium on foundations of computer science, IEEE, pp 517–526
Chen SY-C, Yoo S (2021) Federated quantum machine learning. Entropy 23:460
Chehimi M, Saad W (2022) Quantum federated learning with quantum data. In: ICASSP 2022-2022 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 8617–8621
Das Sarma S, Deng D-L, Duan L-M (2019) Machine learning meets quantum physics. Phys Today 72:48
Gao X, Zhang Z-Y, Duan L-M (2018) A quantum machine learning algorithm based on generative models. Sci Adv 4:eaat9004
Geiping J, Bauermeister H, Dröge H, Moeller M (2020) Inverting gradients-how easy is it to break privacy in federated learning? Adv Neural Inf Process Syst 33:16937
Giovannetti V, Lloyd S, Maccone L (2008a) Quantum random access memory. Phys Rev Lett 100:160501
Giovannetti V, Lloyd S, Maccone L (2008b) Architectures for a quantum random access memory. Phys Rev A 78:052310
González FA, Vargas-Calderón V, Vinck-Posada H (2021) Classification with quantum measurements. J Phys Soc Jpn 90:044002. https://doi.org/10.7566/JPSJ.90.044002
González FA, Gallego A, Toledo-Cortés S, Vargas-Calderón V (2022) Learning with density matrices and random features. Quantum Mach Intell 4:1
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press. http://www.deeplearningbook.org
Guha N, Talwalkar A, Smith V (2019) One-shot federated learning. arXiv:1902.11175
Harrow AW, Hassidim A, Lloyd S (2009) Quantum algorithm for linear systems of equations. Phys Rev Lett 103:150502
Havlíček V, Córcoles AD, Temme K, Harrow AW, Kandala A, Chow JM, Gambetta JM (2019) Supervised learning with quantum-enhanced feature spaces. Nature 567:209
Hsieh K, Phanishayee A, Mutlu O, Gibbons P (2020) The non-iid data quagmire of decentralized machine learning. In: International conference on machine learning, PMLR, pp 4387–4398
Huang H-Y, Kueng R, Torlai G, Albert VV, Preskill J (2022) Provably efficient machine learning for quantum many-body problems. Science 377:eabk3333
Huang H-Y, Kueng R, Preskill J (2020) Predicting many properties of a quantum system from very few measurements. Nature Phys 16:1050
Jacobs RA, Jordan MI, Nowlan SJ, Hinton GE (1991) Adaptive mixtures of local experts. Neural Comput 3:79
Jordan MI, Jacobs RA (1994) Hierarchical mixtures of experts and the em algorithm. Neural Comput 6:181
Kasturi A, Ellore AR, Hota C (2020) Fusion learning: a one shot federated learning. In: International conference on computational science, Springer, pp 424–436
Khraisat A, Alazab A (2021) A critical review of intrusion detection systems in the internet of things: techniques, deployment strategy, validation strategy, attacks, public datasets and challenges. Cybersecurity 4:1
Kingma DP, Ba J (2017) Adam: A method for stochastic optimization. arXiv:1412.6980
Konečnỳ J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D (2016) Federated learning: strategies for improving communication efficiency. arXiv:1610.05492
LaRose R, Tikku A, O’Neel-Judy É, Cincio L, Coles PJ (2019) Variational quantum state diagonalization. Quantum Inf 5:1
Li H-S, Zhu Q, Li M-C, Ian H, et al. (2014) Multidimensional color image storage, retrieval, and compression based on quantum amplitudes and phases. Inf Sci 273:212
Li J, Yang X, Peng X, Sun C-P (2017) Hybrid quantum-classical approach to quantum optimal control. Phys Rev Lett 118:150503
Li W, Deng D-L (2021) Recent advances for quantum classifiers. Sci China Phys Mech Astron 65
Li W, Lu S, Deng D-L (2021a) Quantum federated learning through blind quantum computing. Sci China Phys Mech Astron 64:1
Li H-S, Fan P, Peng H, Song S, Long G-L (2021b) Multilevel 2-d quantum wavelet transforms. IEEE Trans Cybern
Liu J-G, Wang L (2018) Differentiable learning of quantum circuit born machines. Phys Rev A 98:062324
Liu J, Tang Y, Zhao H, Wang X, Li F, Zhang J (2022) Cps attack detection under limited local information in cyber security: a multi-node multi-class classification ensemble approach. arXiv:2209.00170
Lloyd S, Mohseni M, Rebentrost P (2014) Quantum principal component analysis. Nature Phys 10:631
Lloyd S, Weedbrook C (2018) Quantum generative adversarial learning. Phys Rev Lett 121:040502
Long G-L, Sun Y (2001) Efficient scheme for initializing a quantum register with an arbitrary superposed state. Phys Rev A 64:014303
Masoudnia S, Ebrahimpour R (2014) Mixture of experts: a literature survey. Artif Intell Rev 42:275
McMahan B, Moore E, Ramage D, Hampson S, Arcas BAy (2017) Communication-efficient learning of deep networks from decentralized data. In: Singh A, Zhu J (eds) Proceedings of the 20th international conference on artificial intelligence and statistics, proceedings of machine learning research, vol 54. PMLR, pp. 1273–1282
McClean JR, Boixo S, Smelyanskiy VN, Babbush R, Neven H (2018) Barren plateaus in quantum neural network training landscapes. Nature Commun 9:1
Mitarai K, Negoro M, Kitagawa M, Fujii K (2018) Quantum circuit learning. Phys Rev A 98:032309
Nielsen MA, Chuang IL (2010) Quantum computation and quantum information. Cambridge University Press
Plesch M, Brukner Č (2011) Quantum-state preparation with universal gate decompositions. Phys Rev A 83:032302
Rebentrost P, Mohseni M, Lloyd S (2014) Quantum support vector machine for big data classification. Phys Rev Lett 113:130503
Rezende D, Mohamed S (2015) Variational inference with normalizing flows. In: International conference on machine learning, PMLR, pp 1530–1538
Rieke N, Hancox J, Li W, Milletari F, Roth HR, Albarqouni S, Bakas S, Galtier MN, Landman BA, Maier-Hein K, et al. (2020) The future of digital health with federated learning. NPJ Digit Med 3:1
Rubner Y, Tomasi C, Guibas LJ (2000) The earth mover’s distance as a metric for image retrieval. Int J Comput Vis 40:99
Sagi O, Rokach L (2018) Ensemble learning: a survey. Wiley Interdiscip Rev Data Min Knowl Discov 8:e1249
Salehkaleybar S, Sharif-Nassab A, Golestani SJ (2021) One-shot federated learning: theoretical limits and algorithms to achieve them. J Mach Learn Res 22:189
Schuld M, Killoran N (2019) Quantum machine learning in feature hilbert spaces. Phys Rev Lett 122:040504
Swart JM (2020) Introduction to quantum probability. http://staff.utia.cas.cz/swart/lecture_notes/qua20_04_27.pdf. Accessed: July 1 2022
Xia Q, Li Q (2021) Quantumfed: a federated learning framework for collaborative quantum training. In: 2021 IEEE global communications conference (GLOBECOM). IEEE, pp 1–6
Xin T, Che L, Xi C, Singh A, Nie X, Li J, Dong Y, Lu D (2021) Experimental quantum principal component analysis via parametrized quantum circuits. Phys Rev Lett 126:110502
Yun WJ, Kim JP, Jung S, Park J, Bennis M, Kim J (2022) Slimmable quantum federated learning. arXiv:2207.10221
Zhang S-X, Allcock J, Wan Z-Q, Liu S, Sun J, Yu H, Yang X-H, Qiu J, Ye Z, Chen Y-Q, et al. (2022) Tensorcircuit: a quantum software framework for the nisq era. arXiv:2205.10091
Zhao Y, Li M, Lai L, Suda N, Civin D, Chandra V (2018) Federated learning with non-iid data. arXiv:1806.00582
Zhou Y, Pu G, Ma X, Li X, Wu D (2020) Distilled one-shot federated learning. arXiv:2009.07999
Zhou Z-H (2021) Machine learning. Springer Nature
Zhu L, Liu Z, Han S (2019) Deep leakage from gradients. Adv Neural Inf Process Syst 32
Acknowledgments
We thank Weikang Li, Jingyi Zhang, Yuxuan Yan, Rebing Wu, and Yuchen Guo for their insightful discussions. We thank the anonymous reviewers for their constructive suggestions on the manuscript. We also acknowledge the Tsinghua Astrophysics High-Performance Computing platform for providing computational and data storage resources. This work is financially supported by Zhili College, Tsinghua University.
Funding
This work is financially supported by Zhili College, Tsinghua University.
Author information
Ethics declarations
Competing interests
The author declares no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A. Proof of Proposition 1
Proposition 1 is a quantum generalization of its classical counterpart, Proposition 3.1 in Zhao et al. (2018). Below, we provide a detailed proof following the ideas introduced in Zhao et al. (2018). Based on the definition of Δ and the update rules, Eqs. (2), (3) and (4), we have
Now we apply the triangle inequality and the Lipschitz conditions. Together with the definition \(a_{i} = 1 +\eta {\sum }_{k} p^{(i)}(y=k)\lambda _{k}\), we have
Then we continue going backwards in the time steps. With the triangle inequality and the definitions of g(w) and EMDi, we have
By induction and the broadcast rule \(w^{i}_{(m-1)T} = w^{(f)}_{(m-1)T}\), we arrive at
Plugging this into Eq. (14), we finally reach the desired result:
Appendix B. Proof of Theorem 1
With the definitions in Sections 2.1 and 2.4, for any pure input state \(\sigma _{x}^{\psi } = |\psi \rangle \langle \psi |\), the global channel \({\mathscr{M}}\) can be decomposed into
where the second line utilizes the fact that ρ is diagonal in the |Ci〉 basis: \(\rho = {\sum }_{i} P_{C_{i}} \rho P_{C_{i}}\), and the last equality follows from
As for mixed states, they can always be decomposed into a linear combination of pure states. By the linearity of quantum channels, the formula for \({\mathscr{M}}\) acting on mixed states then follows from direct linear superposition. This completes the proof of Theorem 1.
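The linearity argument above can be checked numerically. In this sketch (a hypothetical example, not the channel from the paper), a CPTP channel given by Kraus operators acts on a convex mixture of pure states, and the output equals the same mixture of the per-pure-state outputs:

```python
import numpy as np

def apply_channel(kraus_ops, rho):
    """Apply a CPTP channel via Kraus operators: rho -> sum_k K rho K^dag."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

# A hypothetical single-qubit channel: bit flip with probability p.
p = 0.3
kraus = [np.sqrt(1 - p) * np.eye(2),
         np.sqrt(p) * np.array([[0.0, 1.0], [1.0, 0.0]])]

# A mixed state as a convex combination of two pure states.
psi0 = np.array([1.0, 0.0])
psi1 = np.array([1.0, 1.0]) / np.sqrt(2)
w = 0.6
rho = w * np.outer(psi0, psi0) + (1 - w) * np.outer(psi1, psi1)

# Linearity: acting on the mixture equals the mixture of the outputs.
lhs = apply_channel(kraus, rho)
rhs = w * apply_channel(kraus, np.outer(psi0, psi0)) + \
      (1 - w) * apply_channel(kraus, np.outer(psi1, psi1))
assert np.allclose(lhs, rhs)
```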
Appendix C. A proposal of quantum generative learning with qFedInf
We mentioned in the main text that the proposed framework qFedInf may be applied to machine learning tasks beyond classification. Here we provide a specific example of performing quantum generative learning (Gao et al. 2018; Lloyd and Weedbrook 2018; Liu and Wang 2018) with qFedInf. This serves only as a preliminary proposal, and we leave the detailed study of its performance and implications to future work.
In quantum generative learning, we aim to learn a generative model that can reconstruct some target quantum state ρx. In a federated learning context, each client Ci only has access to a small proportion of the total data, which statistically forms a quantum state \(\rho _{x}^{C_{i}}\). Thus the whole target state can be written as \(\rho _{x} = {\sum }_{i} p_{C_{i}} \rho _{x}^{C_{i}}\), where \(p_{C_{i}}\) is the proportion of data accessible to client Ci. The notations here are the same as in Section 2.1.
We take the quantum generative adversarial network (qGAN) (Lloyd and Weedbrook 2018) as our quantum channel \({\mathscr{M}}\) to perform the learning task. It is a quantum circuit that takes some fixed initial state, for example |0⋯0〉, as its input and outputs a quantum state parameterized by the circuit parameters. Adversarial learning strategies are applied to train the circuit, and the output state after training is expected to approximate the target state. We omit the training details here, as our focus is on the federated learning aspect.
In the qFedInf framework, each client trains its own qGAN, denoted as the local channel \({\mathscr{M}}_{i}\), with its own data \(\rho _{x}^{C_{i}}\). After training, we expect \({\mathscr{M}}_{i}(|0\cdots 0\rangle \langle 0\cdots 0|)\approx \rho _{x}^{C_{i}}\). As for the density estimation part, we note that the input states are fixed to |0⋯0〉, so the density estimators become trivial, i.e., \(\mathcal {D}_{i}(|0\cdots 0\rangle )=1\). Plugging these into Eq. (11), we arrive at
which is exactly our goal. This is a concrete proposal of quantum federated generative learning that, to our knowledge, has not appeared in the literature.
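The one-shot aggregation step can be sketched numerically. In the toy example below, the trained local qGAN outputs are replaced by random density matrices (a hypothetical stand-in for \(\rho _{x}^{C_{i}}\)), and the global target state is recovered as the proportion-weighted mixture:

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Hypothetical stand-in for a trained local qGAN output state."""
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = A @ A.conj().T          # positive semidefinite by construction
    return rho / np.trace(rho)    # normalize to unit trace

rng = np.random.default_rng(0)
dim = 4  # two qubits

# Each client's generator reproduces its local state rho_x^{C_i}.
local_states = [random_density_matrix(dim, rng) for _ in range(3)]
p = np.array([0.5, 0.3, 0.2])  # data proportions p_{C_i}

# One-shot aggregation: the global state is the p-weighted mixture.
rho_x = sum(pi * rho_i for pi, rho_i in zip(p, local_states))

# The mixture is again a valid density matrix: Hermitian, unit trace, PSD.
assert np.allclose(rho_x, rho_x.conj().T)
assert np.isclose(np.trace(rho_x).real, 1.0)
assert np.min(np.linalg.eigvalsh(rho_x)) >= -1e-12
```

A single round of communication, in which each client ships its trained local generator together with its data proportion, thus suffices to reconstruct the global state.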
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, H. Non-IID quantum federated learning with one-shot communication complexity. Quantum Mach. Intell. 5, 3 (2023). https://doi.org/10.1007/s42484-022-00091-z