
Balanced k-means clustering on an adiabatic quantum computer


Abstract

Adiabatic quantum computers are a promising platform for efficiently solving challenging optimization problems. There is therefore considerable interest in using these computers to train computationally expensive machine learning models. We present a quantum approach to solving the balanced k-means clustering training problem on the D-Wave 2000Q adiabatic quantum computer. To do so, we formulate the training problem as a quadratic unconstrained binary optimization (QUBO) problem. Unlike existing classical algorithms, our QUBO formulation targets the global solution of the balanced k-means model. We test our approach on a number of small problems and observe that, despite the theoretical benefits of the QUBO formulation, the clustering solution obtained by a modern quantum computer is usually inferior to the solution obtained by the best classical clustering algorithms. Nevertheless, the solutions provided by the quantum computer do exhibit some promising characteristics. We also perform a scalability study to estimate the run time of our approach on large problems using future quantum hardware. As a final proof of concept, we use the quantum approach to cluster random subsets of the Iris benchmark data set.
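To make the QUBO construction referred to above concrete, the following Python sketch encodes a toy balanced k-means instance as a QUBO matrix. It is an illustration under stated assumptions, not the paper's exact formulation: the penalty weight A and the brute-force minimiser standing in for the D-Wave sampler are assumptions chosen so the example is self-contained and runnable.

    import itertools
    import numpy as np

    # Toy instance (an assumption): N points in the plane, k balanced clusters.
    rng = np.random.default_rng(0)
    N, k = 4, 2                      # N must be divisible by k
    m = N // k                       # target size of each cluster
    X = rng.normal(size=(N, 2))

    # Pairwise squared Euclidean distances between points.
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

    # Binary variable w[i, j] = 1 iff point i is assigned to cluster j,
    # flattened to index i * k + j; the QUBO energy is w^T Q w.
    Q = np.zeros((N * k, N * k))

    def idx(i, j):
        return i * k + j

    # Objective: within-cluster sum of pairwise squared distances.
    for j in range(k):
        for i1 in range(N):
            for i2 in range(N):
                Q[idx(i1, j), idx(i2, j)] += 0.5 * D[i1, i2]

    # Penalty weight (a heuristic assumption, large enough to dominate).
    A = N * D.max()

    # Constraint 1: each point lies in exactly one cluster,
    # A * (sum_j w[i, j] - 1)^2 with the constant term dropped.
    for i in range(N):
        for j1 in range(k):
            Q[idx(i, j1), idx(i, j1)] -= 2 * A
            for j2 in range(k):
                Q[idx(i, j1), idx(i, j2)] += A

    # Constraint 2: each cluster holds exactly m points,
    # A * (sum_i w[i, j] - m)^2 with the constant term dropped.
    for j in range(k):
        for i1 in range(N):
            Q[idx(i1, j), idx(i1, j)] += A * (1 - 2 * m)
            for i2 in range(N):
                if i2 != i1:
                    Q[idx(i1, j), idx(i2, j)] += A

    # Brute force stands in for the quantum annealer on this tiny instance;
    # a D-Wave sampler would minimise the same matrix Q.
    w = min(itertools.product((0, 1), repeat=N * k),
            key=lambda v: np.asarray(v) @ Q @ np.asarray(v))
    print(np.asarray(w).reshape(N, k))   # one-hot cluster assignments

At the problem sizes discussed in the paper, the matrix Q would be embedded on the D-Wave 2000Q rather than enumerated classically; the brute-force step here only verifies that the encoding's ground state is a valid balanced clustering.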





Acknowledgements

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Author information


Corresponding author

Correspondence to Davis Arthur.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Arthur, D., Date, P. Balanced k-means clustering on an adiabatic quantum computer. Quantum Inf Process 20, 294 (2021). https://doi.org/10.1007/s11128-021-03240-8


