Quantum data compression by principal component analysis

Abstract

Data compression can be achieved by reducing the dimensionality of high-dimensional but approximately low-rank datasets, which may in fact be described by the variation of a much smaller number of parameters. It often serves as a preprocessing step to surmount the curse of dimensionality and to gain efficiency, and it therefore plays an important role in machine learning and data mining. In this paper, we present a quantum algorithm that compresses an exponentially large, high-dimensional but approximately low-rank dataset in quantum parallel, by dimensionality reduction (DR) based on principal component analysis (PCA), the most popular classical DR algorithm. We show that the proposed algorithm has a runtime polylogarithmic in the dataset's size and dimensionality, which is exponentially faster than the classical PCA algorithm, when the original dataset is projected onto a polylogarithmically low-dimensional space. The compressed dataset can then be further processed to implement other tasks of interest, with significantly fewer quantum resources. As examples, we apply this algorithm to reduce data dimensionality for two important quantum machine learning algorithms, quantum support vector machine and quantum linear regression for prediction. This work demonstrates that quantum machine learning can be freed from the curse of dimensionality to solve problems of practical importance.
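For orientation, the classical operation that the quantum algorithm accelerates is ordinary PCA-based projection onto the top-\(d\) principal components, retaining at least a fraction \(\vartheta\) of the total variance. The sketch below is a minimal classical (NumPy) illustration of that compression step, not the quantum algorithm itself; the function name pca_compress and the threshold parameter theta are our own illustrative choices.

```python
# A minimal classical sketch (not the paper's quantum algorithm) of PCA-based
# compression: project the rows of an approximately low-rank data matrix X onto
# its top-d principal components, with d chosen so that at least a fraction
# `theta` of the total squared norm sum_j sigma_j^2 is retained.
import numpy as np

def pca_compress(X, theta=0.95):
    """Project rows of X onto the smallest number d of top right singular
    vectors that retains a fraction >= theta of the total variance."""
    # Uncentered SVD, matching the paper's use of the singular values of X.
    _, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    retained = np.cumsum(sigma**2) / np.sum(sigma**2)
    d = int(np.searchsorted(retained, theta)) + 1   # smallest d with ratio >= theta
    V_d = Vt[:d].T                                  # D x d matrix of principal directions
    return X @ V_d, V_d, d                          # compressed data, basis, target dimension

# Usage on a synthetic rank-3 dataset embedded in D = 64 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 64)) + 1e-3 * rng.normal(size=(1000, 64))
Y, V_d, d = pca_compress(X)
print(d, Y.shape)   # typically 3 (1000, 3)
```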



Acknowledgements

We thank Samuel Marsh, Amit Sett, Mitchell Chiew, Kooper De Lacy and Yuan Su for helpful discussions. This work is supported by NSFC (Grant Nos. 61572081, 61672110, and 61671082). C.-H. Yu is supported by China Scholarship Council.

Author information

Corresponding author

Correspondence to Fei Gao.


Appendix: Proof of Theorem 1

Proof

Let \(s_i\) denote the support of \(\mathbf{x}_i\)'s normalized vector \(\left| \mathbf{x}_i \right\rangle\) in the subspace spanned by \(\{\left| \mathbf{v}_1 \right\rangle ,\ldots ,\left| \mathbf{v}_d \right\rangle \}\), i.e., \(s_i=\sum_{j=1}^d \left|\left\langle \mathbf{v}_j| \mathbf{x}_i \right\rangle \right|^2\), and let \(N_1\) be the number of data points having support \(\ge \eta \vartheta\), i.e., \(N_1=\left|\{i:s_i \ge \eta \vartheta ,\ i=1,\ldots ,N\}\right|\). According to Eqs. (1), (3) and (4), we have

$$\begin{aligned} \text{Tr}(XX^T)=\sum_{i=1}^N \left\| \mathbf{x}_i\right\|^2 =\sum_{j=1}^D \sigma_j^2&&(35) \end{aligned}$$

and

$$\begin{aligned} \text{Tr}(XV_dV_d^TX^T)=\sum_{i=1}^N \left\| \mathbf{x}_i\right\|^2 s_i =\sum_{j=1}^d \sigma_j^2.&&(36) \end{aligned}$$

So as required we should have

$$\begin{aligned} \frac{\sum\nolimits_{i=1}^N \left\| \mathbf{x}_i\right\|^2 s_i}{\sum\nolimits_{i=1}^N \left\| \mathbf{x}_i\right\|^2}=\frac{\sum\nolimits_{j=1}^d \sigma_j^2}{\sum\nolimits_{j=1}^D \sigma_j^2}\ge \vartheta.&&(37) \end{aligned}$$

Moreover,

$$\begin{aligned} \frac{\sum\nolimits_{i=1}^N\left\| \mathbf{x}_i\right\|^2 s_i}{\sum\nolimits_{i=1}^N \left\| \mathbf{x}_i\right\|^2}&\le \frac{\sum\nolimits_{s_i < \eta \vartheta }\left\| \mathbf{x}_i\right\|^2\eta \vartheta +\sum\nolimits_{s_i \ge \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}{\sum\nolimits_{i=1}^N \left\| \mathbf{x}_i\right\|^2}&&(38)\\ &=\frac{1+\left( \frac{\sum\nolimits_{s_i< \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}{\sum\nolimits_{s_i \ge \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}\right) \eta \vartheta }{1+\left( \frac{\sum\nolimits_{s_i < \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}{\sum\nolimits_{s_i \ge \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}\right) }&&(39)\\ &\le \frac{1+\left( \frac{N-N_1}{N_1\kappa^2}\right) \eta \vartheta }{1+\left( \frac{N-N_1}{N_1\kappa^2}\right) }&&(40) \end{aligned}$$

since \(\frac{\sum_{s_i < \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}{\sum_{s_i \ge \eta \vartheta }\left\| \mathbf{x}_i\right\|^2}\ge \frac{(N-N_1)\left\| \mathbf{x}\right\|_{\min}^2}{N_1\left\| \mathbf{x}\right\|_{\max}^2} \ge \frac{N-N_1}{N_1\kappa^2}\) and the map \(t\mapsto \frac{1+\eta \vartheta t}{1+t}\) is decreasing in \(t\) whenever \(\eta \vartheta <1\). Combining Eqs. (37) and (40), we can derive

$$\begin{aligned} \frac{N_1}{N}\ge \frac{(1-\eta )\vartheta }{\kappa^2(1-\vartheta )+(1-\eta )\vartheta },&&(41) \end{aligned}$$

which means that a data point \(\mathbf{x}\in \{\mathbf{x}_i\}_{i=1}^N\) chosen uniformly at random will, with probability \(\ge \frac{(1-\eta )\vartheta }{\kappa^2(1-\vartheta )+(1-\eta )\vartheta }\), have a normalized vector \(\left| \mathbf{x} \right\rangle\) with support \(\ge \eta \vartheta\) in the subspace spanned by \(\{\left| \mathbf{v}_1 \right\rangle ,\ldots ,\left| \mathbf{v}_d \right\rangle \}\). \(\square\)
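The bound in Eq. (41) can also be checked numerically on synthetic data. The sketch below is purely illustrative and not part of the paper; following the proof, it takes \(\kappa\) to upper-bound the ratio of the largest to the smallest data-point norm, and all variable names are ours.

```python
# A numerical sanity check (illustrative only) of Theorem 1's bound, Eq. (41).
# Assumption carried over from the proof: kappa^2 >= ||x||_max^2 / ||x||_min^2.
import numpy as np

rng = np.random.default_rng(1)
N, D, rank = 2000, 32, 4
X = rng.normal(size=(N, rank)) @ rng.normal(size=(rank, D)) + 0.05 * rng.normal(size=(N, D))

# Top-d right singular vectors |v_1>,...,|v_d> and retained-variance ratio, Eqs. (35)-(37).
_, sigma, Vt = np.linalg.svd(X, full_matrices=False)
d = 4
vartheta = np.sum(sigma[:d]**2) / np.sum(sigma**2)   # ratio appearing in Eq. (37)

# Support s_i of each normalized data vector in span{|v_1>,...,|v_d>}.
norms2 = np.sum(X**2, axis=1)
s = np.sum((X @ Vt[:d].T)**2, axis=1) / norms2

eta = 0.9
kappa2 = norms2.max() / norms2.min()                  # kappa^2, per the assumption above
N1 = np.count_nonzero(s >= eta * vartheta)            # points with support >= eta * vartheta

bound = (1 - eta) * vartheta / (kappa2 * (1 - vartheta) + (1 - eta) * vartheta)
print(f"N1/N = {N1 / N:.3f} >= bound = {bound:.3f}: {N1 / N >= bound}")
```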

About this article

Cite this article

Yu, CH., Gao, F., Lin, S. et al. Quantum data compression by principal component analysis. Quantum Inf Process 18, 249 (2019). https://doi.org/10.1007/s11128-019-2364-9


Keywords

  • Quantum algorithm
  • Data compression
  • Principal component analysis
  • Quantum machine learning
  • Curse of dimensionality