Abstract
A strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters is considered. The solution criterion is the minimum of the sum (over both clusters) of weighted sums of squared distances from the elements of each cluster to its geometric center. The weights of the sums are equal to the cardinalities of the desired clusters. The center of one cluster is given as input, while the center of the other is unknown and is determined as the point of space equal to the mean of the cluster elements. A version of the problem is analyzed in which the cardinalities of the clusters are given as input. A polynomial-time 2-approximation algorithm for solving the problem is constructed.
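As an illustration only, the criterion described above can be written out for a concrete partition. The sketch below is not the algorithm constructed in the paper: it evaluates the objective for a given split and, for tiny inputs, finds the optimum by exhaustive search over all subsets of the prescribed size. The names `objective`, `brute_force`, `free_cluster`, `fixed_cluster`, and `given_center` are hypothetical and introduced here for the example. The cluster whose center is unknown uses its own mean; the other cluster uses the center supplied as input.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def objective(free_cluster, fixed_cluster, given_center):
    """Cardinality-weighted sum of squared distances over both clusters.

    free_cluster  -- its center is unknown and taken to be its mean
    fixed_cluster -- its center is given_center (an input, by assumption)
    """
    total = 0.0
    if free_cluster:
        centroid = mean(free_cluster)
        total += len(free_cluster) * sum(sq_dist(p, centroid) for p in free_cluster)
    total += len(fixed_cluster) * sum(sq_dist(p, given_center) for p in fixed_cluster)
    return total


def brute_force(points, m, given_center):
    """Exhaustive search over all size-m choices for the free cluster.

    Exponential in len(points); meant only to make the criterion concrete.
    Returns (best_value, free_cluster, fixed_cluster).
    """
    best = None
    indices = range(len(points))
    for chosen in combinations(indices, m):
        chosen_set = set(chosen)
        free = [points[i] for i in chosen]
        fixed = [points[i] for i in indices if i not in chosen_set]
        value = objective(free, fixed, given_center)
        if best is None or value < best[0]:
            best = (value, free, fixed)
    return best


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    value, free, fixed = brute_force(pts, m=2, given_center=(0, 0))
    print(value, free, fixed)
```

Exhaustive search is used here only because it is easy to check by hand; the problem is strongly NP-hard, which is why the paper turns to an approximation algorithm instead.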
References
A. V. Kel’manov and A. V. Pyatkin, “NP-hardness of some quadratic Euclidean 2-clustering problems,” Dokl. Math. 92 (2), 634–637 (2015).
A. V. Kel’manov and A. V. Pyatkin, “On the complexity of some quadratic Euclidean 2-clustering problems,” Comput. Math. Math. Phys. 56 (3), 491–497 (2016).
A. V. Kel’manov and A. V. Motkova, “Exact pseudopolynomial algorithm for a balanced 2-clustering problem,” J. Appl. Ind. Math. 10 (3), 349–355 (2016).
A. V. Kel’manov and A. V. Motkova, “A fully polynomial-time approximation scheme for a special case of a balanced 2-clustering problem,” Lect. Notes Comput. Sci. 9869, 182–192 (2016).
C. C. Aggarwal, Data Mining (Springer, Berlin, 2015).
C. M. Bishop, Pattern Recognition and Machine Learning (Springer Science, New York, 2006).
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer-Verlag, New York, 2001).
M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (Freeman, San Francisco, 1979).
S. Sahni and T. Gonzalez, “P-complete approximation problems,” J. ACM 23, 555–566 (1976).
P. Brucker, “On the complexity of clustering problems,” Lect. Notes Econ. Math. Syst. 157, 45–54 (1978).
M. Inaba, N. Katoh, and H. Imai, “Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering (extended abstract),” Proceedings of the 10th ACM Symposium on Computational Geometry (Stony Brook, New York, 1994), pp. 332–339.
S. Hasegawa, H. Imai, M. Inaba, N. Katoh, and J. Nakano, “Efficient algorithms for variance-based k-clustering,” Proceedings of the 1st Pacific Conference on Computer Graphics and Applications (Pacific Graphics’93, Seoul, Korea) (World Scientific, New York, 1993), Vol. 1, pp. 75–89.
F. de la Vega and C. Kenyon, “A randomized approximation scheme for metric max-cut,” J. Comput. Syst. Sci. 63, 531–541 (2001).
F. de la Vega, M. Karpinski, C. Kenyon, and Y. Rabani, “Polynomial time approximation schemes for metric min-sum clustering,” Electronic Colloquium on Computational Complexity (ECCC), Report No. 25 (2002).
R. A. Fisher, Statistical Methods and Scientific Inference (Hafner, New York, 1956).
M. Rao, “Cluster analysis and mathematical programming,” J. Am. Stat. Assoc. 66, 622–626 (1971).
D. Aloise, A. Deshpande, P. Hansen, and P. Popat, “NP-hardness of Euclidean sum-of-squares clustering,” Machine Learning 75 (2), 245–248 (2009).
A. V. Dolgushev and A. V. Kel’manov, “An approximation algorithm for solving a problem of cluster analysis,” J. Appl. Ind. Math. 5 (4), 551–558 (2011).
A. V. Dolgushev, A. V. Kel’manov, and V. V. Shenmaier, “A polynomial-time approximation scheme for a problem of partitioning a finite set into two clusters,” Proc. Steklov Inst. Math. 295, Suppl. 1, 47–56 (2016).
A. V. Kel’manov and V. I. Khandeev, “A 2-approximation polynomial algorithm for a clustering problem,” J. Appl. Ind. Math. 7 (4), 515–521 (2013).
A. V. Kel’manov and V. I. Khandeev, “A randomized algorithm for two-cluster partition of a set of vectors,” Comput. Math. Math. Phys. 55 (2), 330–339 (2015).
A. V. Kel’manov and V. I. Khandeev, “An exact pseudopolynomial algorithm for a problem of the two-cluster partitioning of a set of vectors,” J. Appl. Ind. Math. 9 (4), 497–502 (2015).
A. V. Kel’manov and V. I. Khandeev, “Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem,” Comput. Math. Math. Phys. 56 (2), 334–341 (2016).
A. V. Kel’manov and S. M. Romanchenko, “An FPTAS for a vector subset search problem,” J. Appl. Ind. Math. 8 (3), 329–336 (2014).
N. Wirth, Algorithms + Data Structures = Programs (Prentice Hall, New Jersey, 1976).
Additional information
Original Russian Text © A.V. Kel’manov, A.V. Motkova, 2018, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2018, Vol. 58, No. 1, pp. 136–142.
Cite this article
Kel’manov, A.V., Motkova, A.V. Polynomial-Time Approximation Algorithm for the Problem of Cardinality-Weighted Variance-Based 2-Clustering with a Given Center. Comput. Math. and Math. Phys. 58, 130–136 (2018). https://doi.org/10.1134/S0965542518010074