Abstract
We consider the strongly NP-hard problem of partitioning a set of Euclidean points into two clusters so as to minimize the sum, over both clusters, of the weighted sums of squared intracluster distances from the elements of each cluster to its center. The weight of each sum is the size of the corresponding cluster. The center of one cluster is given as input, while the center of the other cluster is unknown and is determined as the mean of all points in that cluster (its geometric center). Two variants of the problem are analyzed, in which the cluster sizes are either given or unknown. We present and justify exact pseudopolynomial algorithms for the case of integer components of the input points and fixed space dimension.
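To make the objective concrete, the following is a minimal sketch that evaluates it and finds an optimal partition by exhaustive search on a tiny instance. It is not one of the pseudopolynomial algorithms of the paper: it takes exponential time and is meant only to illustrate the problem statement. The names `objective` and `brute_force` are hypothetical, and the requirement that both clusters be nonempty is an assumption of this sketch.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean (geometric center) of a nonempty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def objective(free_cluster, fixed_cluster, given_center):
    """Size-weighted sum of squared distances to the cluster centers.

    free_cluster  : cluster whose center is its own centroid
    fixed_cluster : cluster whose center is the given input point
    """
    total = 0.0
    if free_cluster:
        c = centroid(free_cluster)
        total += len(free_cluster) * sum(sq_dist(p, c) for p in free_cluster)
    total += len(fixed_cluster) * sum(
        sq_dist(p, given_center) for p in fixed_cluster
    )
    return total


def brute_force(points, given_center, size=None):
    """Exhaustive search over all partitions (illustration only).

    If size is given, the free cluster must contain exactly that many points
    (the "given sizes" variant); otherwise every size from 1 to n - 1 is tried
    (the "unknown sizes" variant). Assumption: both clusters are nonempty.
    Returns (best value, free cluster, fixed cluster).
    """
    n = len(points)
    sizes = [size] if size is not None else range(1, n)
    best = None
    for m in sizes:
        for idx in combinations(range(n), m):
            chosen = set(idx)
            free = [points[i] for i in idx]
            fixed = [points[i] for i in range(n) if i not in chosen]
            value = objective(free, fixed, given_center)
            if best is None or value < best[0]:
                best = (value, free, fixed)
    return best


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
    print(brute_force(pts, given_center=(0, 0)))
```

On the four collinear points in the usage example, with the given center at the origin, the search places the two points near the origin in the cluster with the given center and the two far points in the cluster whose center is their mean.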
Additional information
Original Russian Text © A.V. Kel’manov, A.V. Motkova, 2016, published in Diskretnyi Analiz i Issledovanie Operatsii, 2016, Vol. 23, No. 3, pp. 21–34.
Cite this article
Kel’manov, A.V., Motkova, A.V. Exact pseudopolynomial algorithms for a balanced 2-clustering problem. J. Appl. Ind. Math. 10, 349–355 (2016). https://doi.org/10.1134/S1990478916030054
DOI: https://doi.org/10.1134/S1990478916030054