Abstract
Quantum computation requires large classical datasets to be embedded into quantum states in order to exploit quantum parallelism. However, this embedding requires considerable resources in general. It would therefore be desirable to avoid it, if possible, for noisy intermediate-scale quantum (NISQ) implementation. Accordingly, we consider a classical-quantum hybrid architecture, which allows large classical input data, with a relatively small-scale quantum system. This hybrid architecture is used to implement a sampling oracle. It is shown that in the presence of noise in the hybrid oracle, the effects of internal noise can cancel each other out and thereby improve the query success rate. It is also shown that such an immunity of the hybrid oracle to noise directly and tangibly reduces the sample complexity in the framework of computational learning theory. This NISQ-compatible learning advantage is attributed to the oracle’s ability to handle large input features.
Notes
Here, we consider a binary classification, i.e., mapping \(\{0,1\}^n \rightarrow \{0,1\}\).
Here we should clarify that such an oracle is employed for data sampling, which differs from the oracles employed, for example, in so-called amplitude amplification, where the oracle marks the relative phase on a single data state among superposed ones. In amplitude amplification studies, the primary objective is to reduce the number of iterations of the phase-marking oracle by incorporating other modules [34].
The determination of the optimal (i.e., necessary and sufficient) condition is a central and long-standing interest in computational learning theory, but this aspect is outside the scope of our study [37].
References
Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 41(2), 302–332 (1999)
Grover, L.K.: Quantum mechanics helps in searching for a needle in a haystack. Phys. Rev. Lett. 79, 325 (1997)
Harrow, A.W., Hassidim, A., Lloyd, S.: Quantum algorithm for linear systems of equations. Phys. Rev. Lett. 103, 150502 (2009)
Giovannetti, V., Lloyd, S., Maccone, L.: Quantum random access memory. Phys. Rev. Lett. 100, 160501 (2008)
Giovannetti, V., Lloyd, S., Maccone, L.: Architectures for a quantum random access memory. Phys. Rev. A 78, 052310 (2008)
Aaronson, S.: Read the fine print. Nat. Phys. 11, 291 (2015)
Arunachalam, S., Gheorghiu, V., Jochym-O'Connor, T., Mosca, M., Srinivasan, P.V.: On the robustness of bucket brigade quantum RAM. New J. Phys. 17, 123010 (2015)
Ciliberto, C., Herbster, M., Ialongo, A.D., Pontil, M., Rocchetto, A., Severini, S., Wossnig, L.: Quantum machine learning: a classical perspective. Proc. R. Soc. A 474, 20170551 (2018)
Preskill, J.: Quantum Computing in the NISQ era and beyond. Quantum 2, 79 (2018)
Arute, F., et al.: Quantum supremacy using a programmable superconducting processor. Nature 574, 505 (2019)
Peruzzo, A., McClean, J.R., Shadbolt, P., Yung, M.H., Zhou, X.Q., Love, P.J., Aspuru-Guzik, A., O'Brien, J.L.: A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5, 4213 (2014)
McClean, J.R., Romero, J., Babbush, R., Aspuru-Guzik, A.: The theory of variational hybrid quantum-classical algorithms. New J. Phys. 18, 023023 (2016)
Khoshaman, A., Vinci, W., Denis, B., Andriyash, E., Amin, M.H.: Quantum variational autoencoder. Quant. Sci. Technol. 4, 014001 (2018)
Zhu, D., et al.: Training of quantum circuits on a hybrid quantum computer. Sci. Adv. 5(10) (2019)
Havlíček, V., Córcoles, A.D., Temme, K., Harrow, A.W., Kandala, A., Chow, J.M., Gambetta, J.M.: Supervised learning with quantum-enhanced feature spaces. Nature 567, 209 (2019)
Yoo, S., Bang, J., Lee, C., Lee, J.: A quantum speedup in machine learning: finding a N-bit Boolean function for a classification. New J. Phys. 16, 103014 (2014)
Lee, J.S., Bang, J., Hong, S., Lee, C., Seol, K.H., Lee, J., Lee, K.G.: Experimental demonstration of quantum learning speedup with classical input data. Phys. Rev. A 99, 012313 (2019)
Bang, J., Dutta, A., Lee, S.W., Kim, J.: Optimal usage of quantum random access memory in quantum machine learning. Phys. Rev. A 99, 012326 (2019)
Dunjko, V., Ge, Y., Cirac, J.I.: Computational speedups using small quantum devices. Phys. Rev. Lett. 121, 250501 (2018)
Harrow, A.W.: Small quantum computers and large classical data sets. Preprint arXiv:2004.00026 (2020)
Buhrman, H., Newman, I., Rohrig, H., de Wolf, R.: Robust polynomials and quantum algorithms. Theory Comput. Syst. 40, 379 (2007)
Cross, A.W., Smith, G., Smolin, J.A.: Quantum learning robust against noise. Phys. Rev. A 92, 012327 (2015)
Valiant, L.G.: A theory of the learnable. Commun. ACM 27, 1134 (1984)
Langley, P.: Elements of Machine Learning. Morgan Kaufmann (1995)
Ambainis, A., Iwama, K., Kawachi, A., Masuda, H., Putra, R.H., Yamashita, S.: Quantum identification of Boolean oracles. In: Annual Symposium on Theoretical Aspects of Computer Science, Springer, pp. 105–116 (2004)
Childs, A.M., Kothari, R., Ozols, M., Roetteler, M.: Easy and hard functions for the Boolean hidden shift problem. In: 8th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2013) (Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik) vol. 22 of Leibniz International Proceedings in Informatics (LIPIcs) pp. 50–79 (2013)
Arunachalam, S., de Wolf, R.: Guest column: a survey of quantum learning theory. ACM SIGACT News 48, 41 (2017)
Gupta, P., Agrawal, A., Jha, N.K.: An algorithm for synthesis of reversible logic circuits. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 25, 2317 (2006)
Angluin, D., Laird, P.: Queries and concept learning. Mach. Learn. 2(4), 319 (1988)
Angluin, D., Slonim, D.K.: Randomly fallible teachers: learning monotone DNF with an incomplete membership oracle. Mach. Learn. 14(1), 7 (1994)
Bshouty, N.H., Jackson, J.C.: Learning DNF over the uniform distribution using a quantum example oracle. SIAM J. Comput. 28, 1136 (1998)
Toffoli, T.: Reversible computing. International Colloquium on Automata, Languages, and Programming. Springer, pp. 632–644 (1980)
Younes, A., Miller, J.F.: Representation of Boolean quantum circuits as Reed-Muller expansions. Int. J. Electron. 91, 431 (2004)
Van Dam, W.: Quantum oracle interrogation: Getting all information for almost half the price. In: Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No. 98CB36280) IEEE pp. 362–367 (1998)
Debnath, S., Linke, N.M., Figgatt, C., Landsman, K.A., Wright, K., Monroe, C.: Demonstration of a small programmable quantum computer with atomic qubits. Nature 536, 63 (2016)
Eleuch, H., Hilke, M., MacKenzie, R.: Probing Anderson localization using the dynamics of a qubit. Phys. Rev. A 95, 062114 (2017)
Hanneke, S.: The optimal sample complexity of PAC learning. J. Mach. Learn. Res. 17, 1319 (2016)
Acknowledgements
W.S., N.L., and J.B. are grateful to Gahyun Choi and Yonuk Chong for the valuable discussions on superconducting-qubit experiments. W.S., J.L., and J.B. acknowledge the financial support of the National Research Foundation of Korea (NRF) Grants (No. 2019R1A2C2005504, No. NRF-2019M3E4A1079666, and No. 2021M3E4A1038213), funded by the MSIP (Ministry of Science, ICT and Future Planning) of the Korea government. J.B. also acknowledges the support of the Institute of Information and Communications Technology Planning and Evaluation Grant funded by the Korea government (Grant No. 2020-0-00890, “Development of trusted node core and interfaces for the interoperability among QKD protocols”). W.S. and J.B. acknowledge the research project on developing quantum machine learning and quantum algorithm (No. 2018-104) by the ETRI affiliated research institute. W.S. acknowledges the KIST research program (2E31021). N.L. acknowledges funding from the Shanghai Pujiang Talent Grant (no. 20PJ1408400) and the NSFC International Young Scientists Project (no. 12050410230). N.L. is also supported by the Innovation Program of the Shanghai Municipal Education Commission (no. 2021-01-07-00-02-E00087), the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102) and the Natural Science Foundation of Shanghai grant 21ZR1431000. M.W. and M.P. acknowledge the ICTQT IRAP project of FNP (Contract No. 2018/MAB/5), financed by structural funds of EU. M.W. was supported by NCN Grants 2015/19/B/ST2/01999 and 2017/26/E/ST2/01008. M.P. was supported under FNP Grant First Team/2016-1/5. J.K. was supported in part by KIAS Advanced Research Program (No. CG014604). J.B. was supported by a KIAS Individual Grant (No. CG061003).
Appendices
Appendix A: Detailed calculations of \(P_Q(\omega )\)
Here, we present the procedure to calculate \(P_Q(\omega )\) in Eq. (6) of the main manuscript. We start by analyzing the simplest case, \(\omega =1\). In particular, we consider an input \(\mathbf {x} = x_1 x_2 \cdots x_n\) satisfying \(x_{l_1} = 1\) for an arbitrary \(l_1 \in [1,n]\) and \(x_{j}=0\) for all \(j \ne l_1\). Then only two gates, \({\hat{a}}_0\) and \({\hat{a}}_{l_1}\), are activated, with \(\varOmega _\mathbf {x} = \{ 0, l_1 \}\). In a purely classical query, \(P_C(\omega =1)\) is given as
\[ P_C(\omega =1) = (1-\eta _0)(1-\eta _{l_1}) + \eta _0 \eta _{l_1}, \]
where \(\eta _k\) is the probability that a bit-flip error occurs at \({\hat{a}}_k\) (\(k \in \{0, l_1\}\)); the output is correct when either no flip or two flips occur. Meanwhile, \(P_Q(\omega =1)\) is calculated as
\[ P_Q(\omega =1) = \left| \left<h^\star (\mathbf {x})\right| \hat{\epsilon }_{l_1} {\hat{a}}_{l_1} \hat{\epsilon }_{0} {\hat{a}}_{0} \left| \alpha \right> \right| ^2, \]
where \(\hat{\epsilon }_k=\sqrt{1-\eta _k}\hat{1\!\!1} \pm i \sqrt{\eta _k}\hat{\sigma }_x\) is the error operation defined in the main manuscript. Using the properties in Eq. (7) of the main manuscript, i.e., \(\hat{\sigma }_x {\hat{a}}_k = -{\hat{a}}_k \hat{\sigma }_x\) and \(\left<h^\star (\mathbf {x})\right| {\hat{a}}_{l_1}{\hat{a}}_{0}\left| \alpha \right>=1\), the terms containing a single \(\hat{\sigma }_x\) vanish and we can evaluate the following:
\[ \left<h^\star (\mathbf {x})\right| \hat{\epsilon }_{l_1} {\hat{a}}_{l_1} \hat{\epsilon }_{0} {\hat{a}}_{0} \left| \alpha \right> = \sqrt{1-\eta _0}\sqrt{1-\eta _{l_1}} + \sqrt{\eta _0}\sqrt{\eta _{l_1}}. \tag{15} \]
Subsequently, using Eq. (15), we can obtain
\[ P_Q(\omega =1) = P_C(\omega =1) + \varGamma _{0, l_1}, \tag{16} \]
where
\[ \varGamma _{0, l_1} = 2\sqrt{1-\eta _{0}}\sqrt{1-\eta _{l_1}}\sqrt{\eta _{0}}\sqrt{\eta _{l_1}}. \tag{17} \]
This factor \(\varGamma _{0, l_1}\) arises from quantum superposition and, because \(\varGamma _{0, l_1} \ge 0\), clearly indicates the enhancement of the success probability. In Fig. 3, we depict the graphs of \(P_{C,Q}\) with respect to \(\eta _0\) and \(\eta _{l_1}\). It is noteworthy that our hybrid oracle always yields correct results, i.e., \(P_Q=1\), provided that \(\eta _{l_1} = \eta _0\), even when \(\eta _{l_1}\) and \(\eta _{0}\) are large. This is the most remarkable feature of our classical–quantum hybrid query.
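The \(\omega =1\) result can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes \({\hat{a}}_0={\hat{a}}_{l_1}=\hat{\sigma }_z\) (one possible gate that anticommutes with \(\hat{\sigma }_x\)), the "+" sign in every \(\hat{\epsilon }_k\), and \(\left| \alpha \right> = \left| 0\right>\); the function names are hypothetical. It compares the overlap-based \(P_Q\) with \(P_C + \varGamma _{0,l_1}\), where \(P_C\) is the probability of an even number of bit flips.

```python
import math

SIGMA_Z = [[1, 0], [0, -1]]  # assumed gate; it anticommutes with sigma_x

def apply(op, state):
    """Apply a 2x2 operator to a two-component state vector."""
    return [op[0][0] * state[0] + op[0][1] * state[1],
            op[1][0] * state[0] + op[1][1] * state[1]]

def error_op(eta):
    """sqrt(1-eta)*identity + i*sqrt(eta)*sigma_x (the '+' sign is an assumption)."""
    c, s = math.sqrt(1 - eta), math.sqrt(eta)
    return [[c, 1j * s], [1j * s, c]]

def p_classical(eta0, eta1):
    """Classical success: neither gate flips the bit, or both do."""
    return (1 - eta0) * (1 - eta1) + eta0 * eta1

def gamma(eta0, eta1):
    """Interference term Gamma_{0,l1}."""
    return 2 * math.sqrt((1 - eta0) * (1 - eta1) * eta0 * eta1)

def p_quantum(eta0, eta1, gate=SIGMA_Z):
    """|<h*| eps_1 a_1 eps_0 a_0 |alpha>|^2 for the assumed gates."""
    alpha = [1 + 0j, 0j]
    ideal = apply(gate, apply(gate, alpha))  # noiseless output, plays the role of |h*(x)>
    noisy = alpha
    for eta in (eta0, eta1):
        noisy = apply(error_op(eta), apply(gate, noisy))
    overlap = sum(i.conjugate() * v for i, v in zip(ideal, noisy))
    return abs(overlap) ** 2
```

For example, `p_quantum(0.25, 0.25)` evaluates to 1 up to rounding, whereas `p_classical(0.25, 0.25)` is 0.625.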
Next, we consider the case of \(\omega =2\), where a set of four gates, \({\hat{a}}_0\), \({\hat{a}}_{l_1}\), \({\hat{a}}_{l_2}\), and \({\hat{a}}_{l_3}\), is activated with \(\varOmega _\mathbf {x}=\{ 0, l_1, l_2, l_3 \}\). We calculate \(P_Q(\omega =2)\) as follows:
\[ P_Q(\omega =2) = \left| \left<h^\star (\mathbf {x})\right| \hat{\epsilon }_{l_3} {\hat{a}}_{l_3} \hat{\epsilon }_{l_2} {\hat{a}}_{l_2} \hat{\epsilon }_{l_1} {\hat{a}}_{l_1} \hat{\epsilon }_{0} {\hat{a}}_{0} \left| \alpha \right> \right| ^2. \tag{18} \]
To proceed with the calculation, we introduce an identity \(\hat{1\!\!1}_{\beta ,\beta ^\perp } = \left| \beta \right> \left< \beta \right| + \left| \beta ^\perp \right> \left< \beta ^\perp \right| \), where the state \(\left| \beta \right> \) (\(\beta \in \{0,1\}\)) is defined with the following properties:
Inserting the identity \(\hat{1\!\!1}_{\beta ,\beta ^\perp }\) between \(\hat{\epsilon }_{l_3} {\hat{a}}_{l_3} \hat{\epsilon }_{l_2} {\hat{a}}_{l_2}\) and \(\hat{\epsilon }_{l_1} {\hat{a}}_{l_1} \hat{\epsilon }_{0} {\hat{a}}_{0}\) in Eq. (18), we can obtain
Furthermore, after some algebraic simplifications, we can arrive at
where
Here, \(\varGamma _{a, b}\) is defined as \(\varGamma _{a,b}=2\sqrt{1-\eta _{a}}\sqrt{1-\eta _{b}}\sqrt{\eta _{a}}\sqrt{\eta _{b}}\) for \(a \ne b \in \varOmega _\mathbf {x} =\{ 0, l_1, l_2, l_3 \}\), similarly to Eq. (17). Using Eq. (20) and Eq. (21), we demonstrate that the quantum advantage can be achieved with the positive factors \(\varGamma _{a, b}\). Note that Eq. (21) can be negative, which indicates a disadvantage, e.g., when \(\eta _{l_3}=\eta _{l_1}=0\) or \(\eta _{l_2}=\eta _{l_1}=0\) for all inputs \(\mathbf {x}\). However, such a situation is unlikely to occur in real physical systems. Consistent with the case of \(\omega =1\), we observe that \(P_Q(\omega =2)\) becomes unity when \(\eta _{l_3}=\eta _{l_2}=\eta _{l_1}=\eta _{0}\).
From the two cases above, we can infer that the same method, i.e., introducing the identities, can be used to calculate \(P_Q(\omega )\) for inputs of arbitrarily high Hamming weight. The most remarkable property, i.e., a unit query-success probability for equal error probabilities, can be generalized as well. Therefore, we can conclude that the enhancement in the query-success probability can be achieved for an arbitrary Hamming weight in our hybrid query.
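The equal-error cancellation for higher Hamming weights can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the construction used in the paper: every activated gate is taken to be \(\hat{\sigma }_z\), every error operation uses the "+" sign, the ancilla starts in \(\left| 0\right>\), and \(2^\omega \) gates are activated for weight \(\omega \) (two for \(\omega =1\), four for \(\omega =2\)). The names `hybrid_success` and `classical_success` are hypothetical.

```python
import math

def hybrid_success(etas):
    """Success probability of the noisy gate sequence (assumed sigma_z gates)."""
    a, b = 1 + 0j, 0j  # amplitudes on the correct and the flipped outcome
    for eta in etas:
        b = -b  # action of the assumed gate sigma_z
        c, s = math.sqrt(1 - eta), math.sqrt(eta)
        a, b = c * a + 1j * s * b, 1j * s * a + c * b  # error operation
    return abs(a) ** 2

def classical_success(etas):
    """Probability of an even number of independent bit flips."""
    prod = 1.0
    for eta in etas:
        prod *= 1 - 2 * eta
    return 0.5 * (1 + prod)

if __name__ == "__main__":
    for omega in range(1, 5):
        etas = [0.2] * (2 ** omega)  # equal error probabilities
        print(omega, hybrid_success(etas), classical_success(etas))
```

In this toy model the hybrid success probability stays at unity (up to rounding) for every \(\omega \) when all \(\eta _k\) are equal, while the classical value approaches \(\frac{1}{2}\).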
Appendix B: Numerical analyses with realistic conditions
As mentioned in the main manuscript, in a more realistic situation, the amplitudes related to the errors are not completely canceled out owing to a nonzero \(\varDelta _\eta \), and \(P_Q(\omega )\) exhibits a form analogous to \(P_C(\omega )\) in Eq. (4) of the main manuscript, with an “effective” characteristic constant \(c_\text {eff} \simeq (2\overline{\eta }_\text {eff})^{-1}\). Here, the effective average error \(\overline{\eta }_\text {eff}\) is expected to be much smaller than \(\overline{\eta }\). As a result, the quantum advantage does not depend on the magnitude of \(\overline{\eta }\) but only on \(\varDelta _\eta \), i.e., on how much the error probabilities vary.
To corroborate and extend our theoretical predictions, we perform a numerical analysis. It starts with an input \(\mathbf {x}\) of Hamming weight \(\omega (\mathbf {x})\). We then evaluate \(P_{C,Q}(\mathbf {x})\) by counting the number of outcomes “\(h^\star (\mathbf {x})\)” (i.e., “success”) and “\(h^\star (\mathbf {x}) \oplus 1\)” (i.e., “failure”), such that \(P_{C,Q}(\mathbf {x}) = {N_S}/\left( {N_S + N_F}\right) \), where \(N_S\) and \(N_F\) denote the numbers of successes and failures, respectively, and \(N_S + N_F = 10^5\). Here, we use the Monte Carlo approach to mimic quantum measurement statistics. This simulation is repeated for different values of \(\eta _k\) (for \(k \in \varOmega _{\mathbf {x}}\)) satisfying \(c=(2\overline{\eta })^{-1}\). This condition enables us to analyze the data statistically (i.e., by averaging over the trials) without loss of generality, even though \(\eta _k\) is changed with a different \(h^\star \) in each simulation. First, as an extreme but illustrative example, we consider the case of \(\varDelta _\eta =0\), i.e., we assume \(\eta _k = \overline{\eta }\) for all possible \(k = 1, 2, \ldots , 2^n\). The resulting graphs of \(P_{C,Q}\) versus \(\omega \) are shown as dots in Fig. 4a for \(\overline{\eta }=10^{-4}\), \(10^{-3}\), and \(10^{-2}\), where each data point of \(P_{C,Q}\) is obtained by averaging over \(\simeq 10^3\) trials. Here, \(P_C\) is observed to decay quickly to \(\frac{1}{2}\), in good agreement with Eq. (6) of the main manuscript. Meanwhile, the data of \(P_Q\) remain at unity regardless of the value of \(\overline{\eta }\), as predicted. Next, we consider a realistic situation, assuming that \(\eta _k\) is drawn from a normal distribution \({{\mathcal {N}}}(\overline{\eta }, \varDelta _\eta )\) for all \(k = 1,2,\ldots ,2^n\) (and hence for \(k \in \varOmega _{\mathbf {x}}\)).
Here, we set \(\overline{\eta }=10^{-3}\) with \(\varDelta _\eta = 1\%\), \(5\%\), and \(10\%~\text {of}~\overline{\eta }\). The simulation results are shown in Fig. 4b. For all cases of \(\varDelta _\eta \), both \(P_{C}\) and \(P_{Q}\) decay to \(\frac{1}{2}\); however, \(P_Q\) decays much more slowly. It is also observed that the data of \(P_Q\) match well with Eq. (6) of the main manuscript, thus allowing us to identify the effective characteristic constant \(c_\text {eff}\). The identified values of \(c_\text {eff}\) and \(\overline{\eta }_\text {eff}\) are listed in Table 2; they satisfy the predicted condition in Eq. (8) of the main manuscript.
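The simulation procedure described above can be sketched as follows. This is an illustrative toy model, not the authors' simulation code: it assumes \(\hat{\sigma }_z\) gates with "+"-sign error operations, \(2^\omega \) activated gates for weight \(\omega \), and a decay of the form \(P = \frac{1}{2} + \frac{1}{2}\exp (-2^\omega /c)\), which is consistent with \(c=(2\overline{\eta })^{-1}\) for independent bit flips. All function names are hypothetical.

```python
import math
import random

def hybrid_success(etas):
    """Exact success probability of the toy hybrid query (assumed sigma_z gates)."""
    a, b = 1 + 0j, 0j
    for eta in etas:
        b = -b
        c, s = math.sqrt(1 - eta), math.sqrt(eta)
        a, b = c * a + 1j * s * b, 1j * s * a + c * b
    return abs(a) ** 2

def classical_success(etas):
    """Probability of an even number of independent bit flips."""
    prod = 1.0
    for eta in etas:
        prod *= 1 - 2 * eta
    return 0.5 * (1 + prod)

def draw_etas(n_gates, eta_bar, rel_spread, rng):
    """Draw eta_k from N(eta_bar, rel_spread * eta_bar), clipped to [0, 1]."""
    return [min(max(rng.gauss(eta_bar, rel_spread * eta_bar), 0.0), 1.0)
            for _ in range(n_gates)]

def estimate(p, shots, rng):
    """Monte Carlo estimate N_S / (N_S + N_F) of a success probability p."""
    successes = sum(1 for _ in range(shots) if rng.random() < p)
    return successes / shots

def effective_c(p, n_gates):
    """Invert P = 1/2 + 1/2 * exp(-n_gates / c); requires P > 1/2."""
    x = 2 * p - 1
    if x >= 1:
        return math.inf  # no measurable decay
    return -n_gates / math.log(x)

if __name__ == "__main__":
    rng = random.Random(0)
    etas = draw_etas(2 ** 8, 1e-3, 0.05, rng)  # omega = 8, spread 5% of eta_bar
    for label, p in (("classical", classical_success(etas)),
                     ("hybrid", hybrid_success(etas))):
        print(label, estimate(p, 10 ** 5, rng), effective_c(p, len(etas)))
```

In this sketch the classical constant comes out close to \((2\overline{\eta })^{-1}=500\), while the hybrid value is larger and is governed by the spread of \(\eta _k\) rather than by \(\overline{\eta }\).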
For a more realistic condition, we consider another type of error, i.e., a phase flip in the assistant qubit, whose suppression would be crucial for maintaining a higher success rate of the query. In particular, we assume that the phase-flip errors primarily occur when the qubit travels between \({\hat{a}}_k\) and \({\hat{a}}_{k+1}\) with a certain probability \(\chi _k \le \frac{1}{2}\). First, when \(\varDelta _\eta =0\) (or equivalently, \(\eta _k = \overline{\eta }\)) for all k, the phase-flip errors do not affect the query process and \(P_{Q}\) becomes unity. In the realistic case of \(\varDelta _\eta \ne 0\), however, it is predicted that the amplitudes of the bit-flip errors interfere in a disordered manner owing to the phase flips, and eventually the quantum advantage becomes smaller, as described in our main manuscript. Thus, we perform the simulations and present the data of \(P_Q\) in Fig. 5. Here, \(\chi _k\) is assumed to be drawn from \({{\mathcal {N}}}(\overline{\chi }, \varDelta _\chi )\) for all \(k=1,2,\ldots ,2^n\). The simulation data are generated for \(\overline{\chi }=10^{-4}\), \(10^{-3}\) and \(10^{-2}\). Here, we set \(\overline{\eta }=10^{-3}\) with \(\varDelta _\eta = 5\%~\text {of}~\overline{\eta }\). The data are well fitted by Eq. (6) of our main manuscript, and the values of \(c_\text {eff}\) are well estimated from the data (see Table 3). As expected, the quantum advantage becomes less pronounced as \(\overline{\chi }\) is increased; however, the hybrid query still exhibits a higher success rate. It is noteworthy that the data obtained for \(\varDelta _\chi = 1\%\) and \(10\%~\text {of}~\overline{\chi }\) are almost identical (up to the second decimal digit).
Appendix C: Reduction in learning sample complexity in the framework of probably-approximately-correct (PAC) learning
In a probably-approximately-correct (PAC) learning model, a learner (or equivalently, a learning algorithm) samples a finite set of training data \(\{ (\mathbf {x}_i, h^\star (\mathbf {x}_i)) \}\) (\(i = 1,2,\ldots ,M\)) by accessing an oracle, aiming to obtain the best hypothesis h close to \(h^\star \) from a given set H of hypotheses. Here, \(\mathbf {x}_i\) is typically assumed to be drawn uniformly. A learning algorithm is then an (\(\epsilon \), \(\delta \))-PAC learner (under the uniform distribution) if it obtains an \(\epsilon \)-approximately correct h with probability at least \(1-\delta \); more specifically, if it satisfies
\[ \Pr \left[ E(h, h^\star ) \le \epsilon \right] \ge 1-\delta , \tag{23} \]
where \(E(h, h^\star )\) denotes the error. Here, if h identified by the algorithm agrees with
\[ M \ge \frac{1}{\epsilon } \ln \frac{\left| H\right| }{\delta } \tag{24} \]
of samples constructed from the oracle, then Eq. (23) holds. Here, \(\left| H\right| \le 2^{2^n}\) denotes the cardinality of H. Equation (24) is known as the bound of the sample complexity [23, 24], i.e., it yields the minimum number of training samples needed to successfully learn \(h \in { H}\) satisfying Eq. (23). This sample complexity bound from previous studies carries over directly to our scenario; in our classical–quantum hybrid query scheme, the same sample complexity bound applies, because \(\mathbf {x}_i\) and the value \(h^\star (\mathbf {x}_i)\) identified by the measurement on the output state are classical.
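To give a sense of scale for the noiseless bound, the sketch below evaluates it for the largest hypothesis set, \(\left| H\right| = 2^{2^n}\). It assumes the standard finite-class form \(M \ge \frac{1}{\epsilon }\ln (\left| H\right| /\delta )\) with natural logarithms; the function name `pac_sample_bound` is hypothetical.

```python
import math

def pac_sample_bound(n, epsilon, delta):
    """Smallest integer M with M >= (1/epsilon) * ln(|H|/delta), for |H| = 2^(2^n)."""
    ln_h = (2 ** n) * math.log(2)  # ln|H| computed without forming |H| itself
    return math.ceil((ln_h + math.log(1 / delta)) / epsilon)

if __name__ == "__main__":
    for n in (3, 8, 16):
        print(n, pac_sample_bound(n, epsilon=0.1, delta=0.05))
```

For \(n=3\), \(\epsilon =0.1\), and \(\delta =0.05\), the bound evaluates to 86 samples; it grows in proportion to \(2^n\) for the full hypothesis set.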
However, when the oracle is not perfect, the bound of the sample complexity in Eq. (24) is modified as follows. First, we draw a sequence of training data \(\{ (\mathbf {x}_1, m_1), (\mathbf {x}_2, m_2), \ldots , (\mathbf {x}_M, m_M) \}\) sampled from our classical–quantum hybrid oracle, where \(m_i \in \{ h^\star (\mathbf {x}_i), h^\star (\mathbf {x}_i) \oplus 1 \}\) denotes the outcome of the measurement performed on the output state. Subsequently, if the sampling is performed with
we can verify that Eq. (23) holds for the algorithm that obtains h maximizing \({\overline{P}}_Q\). In fact, it has been proven that the additional factor \(A_Q\) is given as [30]
It is noteworthy that in the purely classical case, the corresponding factor, e.g., \(A_C\), is given with \({\overline{P}}_C\) instead of \({\overline{P}}_Q\). Thus, we can derive the reduction in the sample complexity with the condition \(A_Q \le A_C\) from \({\overline{P}}_Q \ge {\overline{P}}_C\). To view this explicitly, we rewrite \(A_Q\) in Eq. (26) to a more useful form:
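The effect of the average success probability on the sample requirement can be illustrated with a known noisy-example bound. The sketch below does not reproduce Eq. (26); as a stand-in assumption it uses the Angluin–Laird bound for randomly mislabeled examples, \(M \ge \frac{2}{\epsilon ^2 (1-2\eta _b)^2}\ln (2\left| H\right| /\delta )\), with the noise rate identified as \(\eta _b = 1-{\overline{P}}\) and \(\left| H\right| = 2^{2^n}\). The function name `noisy_sample_bound` is hypothetical.

```python
import math

def noisy_sample_bound(n, epsilon, delta, p_bar):
    """Angluin-Laird-type bound with noise rate 1 - p_bar (requires p_bar > 1/2)."""
    if not 0.5 < p_bar <= 1.0:
        raise ValueError("p_bar must lie in (1/2, 1]")
    noise = 1.0 - p_bar
    # ln(2|H|/delta) for |H| = 2^(2^n), computed without forming |H|
    ln_term = math.log(2) + (2 ** n) * math.log(2) + math.log(1 / delta)
    return math.ceil(2 * ln_term / (epsilon ** 2 * (1 - 2 * noise) ** 2))

if __name__ == "__main__":
    # Hypothetical average success probabilities for the two oracles.
    print("hybrid   :", noisy_sample_bound(4, 0.1, 0.05, p_bar=0.99))
    print("classical:", noisy_sample_bound(4, 0.1, 0.05, p_bar=0.80))
```

Because the prefactor decreases monotonically as \({\overline{P}}\) increases, a larger average success probability always yields a smaller bound in this form, which mirrors the argument \(A_Q \le A_C\) for \({\overline{P}}_Q \ge {\overline{P}}_C\).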
This implies that the sample complexity bound increases only slightly when n is small, but increases abruptly near \(n \simeq 2\log _2{\gamma c}\). As \(A_C\) is characterized by c without \(\gamma \), we can interpret \(\gamma \) as a quantum learning advantage in the PAC learning framework; i.e., for larger n, we can still define an (\(\epsilon \), \(\delta \))-PAC learner with our hybrid oracle, unlike with a fully classical one. It is noteworthy that if n is excessively large, i.e., when \(n \gg 2\log _2{\gamma c}\), it is impractical to define a legitimate PAC learner even with our hybrid oracle. This result is consistent with the recent theoretical study in Ref. [22]; however, in our case, such a quantum learning advantage is achieved with classical data.
To corroborate and extend our analysis, numerical simulations are performed: For a given n, we prepare a set of inputs \(\{\mathbf {x}_1, \mathbf {x}_2, \ldots , \mathbf {x}_M \}\) that is sampled randomly. For the given \(\omega (\mathbf {x}_i)\) (\(i=1,2,\ldots ,M\)), we evaluate \(P_{C,Q}(\omega )\) by counting \(10^5\) queries and identify their average value, i.e., \(\frac{1}{10^5} \sum _{i=1}^{10^5} P_{C,Q}(\omega (\mathbf {x}_i))\). This process is repeated \(\simeq 10^3\) times for different input sets to analyze \({\overline{P}}_{C,Q}(n)\) statistically. The data are generated from \(n=8\) to \(n=35\), assuming that \(\eta _k\) and \(\chi _k\) (\(\forall k\)) are drawn from \({{\mathcal {N}}}(\overline{\eta }, \varDelta _\eta )\) and \({{\mathcal {N}}}(\overline{\chi }, \varDelta _\chi )\), respectively. Here, we consider \(\overline{\eta }=10^{-3}\) with \(\varDelta _\eta = 0.05\overline{\eta }\) and \(\overline{\chi }=10^{-2}\) with \(\varDelta _\chi = 0.1\overline{\chi }\). In each simulation, M is fixed to 100. The obtained data agree well with our theoretical predictions (see Fig. 2a, b in the main manuscript).
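The averaging over randomly sampled inputs can be sketched as follows. This is a simplified illustration, not the simulation used for Fig. 2: it assumes the closed form \(P(\omega ) = \frac{1}{2} + \frac{1}{2}\exp (-2^\omega /c)\) for both oracles, with \(c = (2\overline{\eta })^{-1}\) for the classical case and a hypothetical enhanced constant \(\gamma c\) for the hybrid case. The value of `gamma` and the function names are placeholders.

```python
import math
import random

def sample_weights(n, m, rng):
    """Hamming weights of m inputs drawn uniformly from {0,1}^n."""
    return [bin(rng.getrandbits(n)).count("1") for _ in range(m)]

def success_probability(weight, c):
    """Assumed decay law P = 1/2 + 1/2 * exp(-2^weight / c)."""
    return 0.5 + 0.5 * math.exp(-(2 ** weight) / c)

def mean_success(weights, c):
    """Average success probability over a set of sampled inputs."""
    return sum(success_probability(w, c) for w in weights) / len(weights)

if __name__ == "__main__":
    rng = random.Random(1)
    c = 1 / (2 * 1e-3)   # classical constant for eta_bar = 1e-3
    gamma = 100.0        # hypothetical enhancement factor
    for n in range(8, 36, 9):
        weights = sample_weights(n, 100, rng)  # M = 100 inputs
        print(n, mean_success(weights, c), mean_success(weights, gamma * c))
```

Evaluating both averages on the same sampled inputs makes the comparison pointwise: for any fixed input set, the average with the larger constant is never smaller.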
Cite this article
Song, W., Wieśniak, M., Liu, N. et al. Tangible reduction in learning sample complexity with large classical samples and small quantum system. Quantum Inf Process 20, 275 (2021). https://doi.org/10.1007/s11128-021-03217-7