DOOR 2016: Discrete Optimization and Operations Research pp 171-181 | Cite as
An Approximation Algorithm for a Problem of Partitioning a Sequence into Clusters with Restrictions on Their Cardinalities
Abstract
We consider the problem of partitioning a finite sequence of points in Euclidean space into a given number of clusters (subsequences) minimizing the sum of squared distances between cluster elements and the corresponding cluster centers. It is assumed that the center of one of the desired clusters is the origin, while the centers of the other clusters are unknown and determined as the mean values over clusters elements. Additionally, there are a few structural restrictions on the elements of clusters with unknown centers: (1) clusters form non-overlapping subsequences of the input sequence, (2) the difference between two consecutive indices is bounded from below and above by prescribed constants, and (3) the total number of elements in these clusters is given as an input. It is shown that the problem is strongly NP-hard. A 2-approximation algorithm which runs in polynomial time for a fixed number of clusters is proposed for this problem.
Keywords
Clustering Structural constraints Euclidean space Minimum sum-of-squared distances NP-hardness Guaranteed approximation factorNotes
Acknowledgments
This work was supported by Russian Science Foundation, project no. 16-11-10041.
References
- 1.Fu, T.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)CrossRefGoogle Scholar
- 2.Kuenzer, C., Dech, S., Wagner, W.: Remote Sensing Time Series. Remote Sensing and Digital Image Processing, vol. 22. Springer, Switzerland (2015)Google Scholar
- 3.Warren Liao, T.: Clustering of time series data – a survey. Pattern Recogn. 38(11), 1857–1874 (2005)CrossRefMATHGoogle Scholar
- 4.Aggarwal, C.C.: Data Mining: The Textbook. Springer, Switzerland (2015)CrossRefMATHGoogle Scholar
- 5.Kel’manov, A.V., Pyatkin, A.V.: On complexity of some problems of cluster analysis of vector sequences. J. Appl. Ind. Math. 7(3), 363–369 (2013)MathSciNetCrossRefMATHGoogle Scholar
- 6.Kel’manov, A.V., Khamidullin, S.A.: An approximating polynomial algorithm for a sequence partitioning problem. J. Appl. Ind. Math. 8(2), 236–244 (2014)MathSciNetCrossRefMATHGoogle Scholar
- 7.Kel’manov, A.V., Mikhailova, L.V.: Joint detection of a given number of reference fragments in a quasi-periodic sequence and its partition into segments containing series of identical fragments. Comput. Math. Math. Phys. 46(1), 165–181 (2006)MathSciNetCrossRefMATHGoogle Scholar
- 8.Kel’manov, A.V., Khamidullin, S.A., Khandeev, V.I.: An exact pseudopolynomial algorithm for a sequence bi-clustering problem (in Russian). In: Book of Abstract of the XVth Russian Conference “Mathematical Programming and Applications”, pp. 139–140. Inst. Mat. Mekh. UrO RAN, Ekaterinburg (2015)Google Scholar
- 9.Kel’manov, A.V., Khamidullin, S.A., Khandeev, V.I.: A fully polynomial-time approximation scheme for a sequence 2-cluster partitioning problem. J. Appl. Indust. Math. 10(2), 209–219 (2016)CrossRefMATHGoogle Scholar
- 10.Kel’manov, A.V., Romanchenko, S.M.: An FPTAS for a vector subset search problem. J. Appl. Indust. Math. 8(3), 329–336 (2014)MathSciNetCrossRefMATHGoogle Scholar