Quantum annealing for combinatorial clustering

Quantum Information Processing

Abstract

Clustering is a powerful machine learning technique that groups “similar” data points based on their characteristics. Many clustering algorithms work by approximately minimizing an objective function, namely the sum of within-cluster distances between points. The straightforward approach examines every possible assignment of points to clusters. This guarantees that the solution is a global minimum; however, the number of possible assignments grows exponentially with the number of data points (K^N for N points and K clusters), so exhaustive search becomes computationally intractable even for very small datasets. To circumvent this issue, cost function minima are usually sought with popular local search-based heuristics such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better-quality results but can be too time-consuming in practice. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms, which are then implemented on commercially available quantum annealing hardware as well as on a purely classical solver, “qbsolv.” The first algorithm assigns N data points to K clusters, and the second performs binary clustering in a hierarchical manner. We present our results as benchmarks against the well-known k-means clustering algorithm and discuss the advantages and disadvantages of the proposed techniques.
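
To make the mapping to a quadratic binary optimization problem concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-hot encoding (one binary variable per point-cluster pair), squared Euclidean distances, and a penalty term that forces each point into exactly one cluster; none of these choices is stated in the abstract. All function names are hypothetical, and an exhaustive search over bit strings stands in for the annealer, which is feasible only for toy inputs.

```python
from itertools import product


def squared_distances(points):
    """Pairwise squared Euclidean distances as a nested list."""
    n = len(points)
    return [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]


def build_qubo(dist, k, penalty):
    """Build {(u, v): weight} with u <= v; variable i * k + c means
    'point i is in cluster c' (assumed one-hot encoding)."""
    n = len(dist)
    qubo = {}

    def add(u, v, weight):
        key = (min(u, v), max(u, v))
        qubo[key] = qubo.get(key, 0) + weight

    # Objective: sum of within-cluster distances over all pairs i < j.
    for c in range(k):
        for i in range(n):
            for j in range(i + 1, n):
                add(i * k + c, j * k + c, dist[i][j])

    # Constraint: penalty * (sum_c x_ic - 1)^2, expanded with x^2 = x.
    # The constant term is dropped because it does not move the minimum.
    for i in range(n):
        for c in range(k):
            add(i * k + c, i * k + c, -penalty)
            for c2 in range(c + 1, k):
                add(i * k + c, i * k + c2, 2 * penalty)
    return qubo


def qubo_energy(qubo, bits):
    """Evaluate the quadratic form for one assignment of the binary variables."""
    return sum(w * bits[u] * bits[v] for (u, v), w in qubo.items())


def brute_force_minimum(qubo, num_vars):
    """Exhaustive search over all 2^num_vars bit strings (toy sizes only)."""
    return min(product((0, 1), repeat=num_vars),
               key=lambda bits: qubo_energy(qubo, bits))


def decode(bits, n, k):
    """Turn one-hot rows into cluster labels; None marks a violated constraint."""
    labels = []
    for i in range(n):
        row = bits[i * k:(i + 1) * k]
        labels.append(row.index(1) if sum(row) == 1 else None)
    return labels


# Toy data: two well-separated pairs of points, K = 2.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
num_clusters = 2
dist = squared_distances(points)
# Assumed penalty: larger than the total of all pairwise distances, so
# breaking a one-hot constraint can never lower the energy.
penalty = 1 + sum(dist[i][j]
                  for i in range(len(points))
                  for j in range(i + 1, len(points)))
qubo = build_qubo(dist, num_clusters, penalty)
best_bits = brute_force_minimum(qubo, len(points) * num_clusters)
labels = decode(best_bits, len(points), num_clusters)
print(labels)
```

The cluster labels are interchangeable, so only the grouping matters: the two points near the origin share one label and the two points near (5, 5) share the other. In a real run, the exhaustive search would be replaced by an annealer or a classical QUBO solver acting on the same coefficients, and the penalty weight would likely need tuning rather than the conservative bound used here.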



Acknowledgements

We acknowledge the support of the Universities Space Research Association, Quantum AI Lab Research Opportunity Program, Cycle 2.

Author information

Corresponding author

Correspondence to Vaibhaw Kumar.

About this article

Cite this article

Kumar, V., Bass, G., Tomlin, C. et al. Quantum annealing for combinatorial clustering. Quantum Inf Process 17, 39 (2018). https://doi.org/10.1007/s11128-017-1809-2

  • DOI: https://doi.org/10.1007/s11128-017-1809-2
