On the complexity of some quadratic Euclidean 2-clustering problems

Computational Mathematics and Mathematical Physics

Abstract

Several problems of partitioning a finite set of points in Euclidean space into two clusters are considered. The following criteria are minimized: (1) the sum, over both clusters, of the sums of squared pairwise distances between the elements of the cluster and (2) the sum, over both clusters, of the sums of squared distances from the elements of the cluster to its geometric center, each multiplied by the cardinality of the cluster. Here the geometric center (or centroid) of a cluster is the mean of its elements. A third problem close to (2) is also considered, in which the center of one cluster is given as input, while the center of the other cluster is unknown and, as in problem (2), is a variable to be optimized. Two variants of each problem are analyzed, in which the cluster cardinalities are either (a) part of the input or (b) optimization variables. It is proved that all the considered problems are strongly NP-hard and that, in general, they admit no fully polynomial-time approximation scheme unless P = NP.
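To make criteria (1) and (2) and the fixed-center problem concrete, the following is a minimal Python sketch, not the authors' construction. It rests on assumptions the abstract does not state: pairwise distances are summed over unordered pairs (with ordered pairs, criterion (1) would simply double), and the fixed-center term is weighted by cardinality in the same way as criterion (2). The function names and the exhaustive search are hypothetical illustrations; the search takes exponential time and is only usable on tiny inputs, which is consistent with the hardness results stated above.

```python
from itertools import combinations

import numpy as np


def pairwise_cost(cluster):
    """Criterion (1) for one cluster: sum of squared distances over
    unordered pairs of points (assumption: unordered, not ordered, pairs)."""
    return sum(float(np.sum((a - b) ** 2)) for a, b in combinations(cluster, 2))


def centroid_cost(cluster):
    """Criterion (2) for one cluster: cardinality times the sum of squared
    distances from the points to the cluster centroid."""
    pts = np.asarray(cluster, dtype=float)
    centroid = pts.mean(axis=0)
    return len(pts) * float(np.sum((pts - centroid) ** 2))


def fixed_center_cost(cluster, center):
    """Term for the cluster whose center is given as input (assumed form:
    same cardinality weighting as criterion (2))."""
    pts = np.asarray(cluster, dtype=float)
    diff = pts - np.asarray(center, dtype=float)
    return len(pts) * float(np.sum(diff ** 2))


def best_2_partition(points, cost, size=None):
    """Exhaustive search over 2-partitions into nonempty clusters.

    If `size` is given (1 <= size <= n - 1), the first cluster must have
    exactly that many points (cardinalities are part of the input);
    otherwise every size from 1 to n - 1 is tried (cardinalities are
    optimization variables). Returns (best value, indices of first cluster).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    best_value, best_idx = float("inf"), None
    sizes = [size] if size is not None else range(1, n)
    for m in sizes:
        for idx in combinations(range(n), m):
            mask = np.zeros(n, dtype=bool)
            mask[list(idx)] = True
            value = cost(pts[mask]) + cost(pts[~mask])
            if value < best_value:
                best_value, best_idx = value, idx
    return best_value, best_idx


# Hypothetical toy data: two well-separated pairs of points in the plane.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
value_1, first_cluster_1 = best_2_partition(points, pairwise_cost)
value_2, first_cluster_2 = best_2_partition(points, centroid_cost)
```

Under the unordered-pair assumption the two per-cluster costs coincide, because the sum of squared pairwise distances in a cluster equals its cardinality times the sum of squared distances to its centroid. The fixed-center problem differs in that one of the two terms uses `fixed_center_cost` with the given center instead of `centroid_cost`.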

Author information

Correspondence to A. V. Kel’manov.

Additional information

Original Russian Text © A.V. Kel’manov, A.V. Pyatkin, 2016, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2016, Vol. 56, No. 3, pp. 498–504.

Cite this article

Kel’manov, A.V., Pyatkin, A.V. On the complexity of some quadratic Euclidean 2-clustering problems. Comput. Math. and Math. Phys. 56, 491–497 (2016). https://doi.org/10.1134/S096554251603009X

  • DOI: https://doi.org/10.1134/S096554251603009X
