Abstract
Several problems of partitioning a finite set of points in Euclidean space into two clusters are considered. The criteria to be minimized are (1) the sum, over both clusters, of the squared pairwise distances between the elements of each cluster and (2) the sum, over both clusters, of the cluster's cardinality multiplied by the sum of squared distances from its elements to its geometric center, where the geometric center (or centroid) of a cluster is the mean of its elements. A further problem close to (2) is also considered, in which the center of one cluster is given as part of the input, while the center of the other cluster is unknown and must be optimized, as in problem (2). Each problem is analyzed in two variants, in which the cluster cardinalities are either (a) part of the input or (b) optimization variables. It is proved that all the considered problems are strongly NP-hard and that, in general, they admit no fully polynomial-time approximation scheme unless P = NP.
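To make criteria (1) and (2) concrete, the following is a minimal Python sketch, not the authors' method. The function names, the sample points, and the convention that each unordered pair is counted once in criterion (1) are assumptions made for illustration. The exhaustive search is exponential in the number of points and is meant only for tiny inputs, which is consistent with the hardness results stated above.

```python
from itertools import combinations


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def centroid(cluster):
    """Geometric center: the coordinate-wise mean of the cluster's points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def pairwise_criterion(clusters):
    """Criterion (1): squared pairwise distances summed inside each cluster.

    Assumption: every unordered pair is counted exactly once.
    """
    return sum(sq_dist(x, y) for c in clusters for x, y in combinations(c, 2))


def weighted_centroid_criterion(clusters):
    """Criterion (2): |C| times the squared distances to the centroid of C."""
    total = 0.0
    for c in clusters:
        mu = centroid(c)
        total += len(c) * sum(sq_dist(x, mu) for x in c)
    return total


def best_partition(points, size_first=None):
    """Exhaustive search for a 2-partition minimizing criterion (1).

    size_first=None treats the cardinalities as optimization variables;
    an integer fixes the size of the first cluster as part of the input.
    Hypothetical helper for illustration; runs in exponential time.
    """
    n = len(points)
    sizes = [size_first] if size_first is not None else range(1, n)
    best = None
    for m in sizes:
        for idx in combinations(range(n), m):
            chosen = set(idx)
            first = [points[i] for i in idx]
            second = [points[i] for i in range(n) if i not in chosen]
            cost = pairwise_criterion([first, second])
            if best is None or cost < best[0]:
                best = (cost, first, second)
    return best


# Illustrative data (an assumption, not taken from the paper).
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
a, b = points[:3], points[3:]
print(pairwise_criterion([a, b]))
print(weighted_centroid_criterion([a, b]))
print(best_partition(points)[0])
```

Under the unordered-pair convention assumed here, the two criteria take the same value on any partition, because the sum of squared pairwise distances within a cluster equals the cluster's cardinality times the sum of squared distances to its centroid.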
Additional information
Original Russian Text © A.V. Kel’manov, A.V. Pyatkin, 2016, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2016, Vol. 56, No. 3, pp. 498–504.
Cite this article
Kel’manov, A.V., Pyatkin, A.V. On the complexity of some quadratic Euclidean 2-clustering problems. Comput. Math. and Math. Phys. 56, 491–497 (2016). https://doi.org/10.1134/S096554251603009X
DOI: https://doi.org/10.1134/S096554251603009X