Using a Set of Triangle Inequalities to Accelerate K-means Clustering
Abstract
The k-means clustering problem is well known in data mining and machine learning. However, the de facto standard, i.e., Lloyd's k-means algorithm, spends a large amount of time on distance calculations. Elkan's k-means algorithm, one prominent approach, exploits the triangle inequality to greatly reduce such distance calculations between points and centers, while achieving exactly the same clustering results with significant speed improvement, especially on high-dimensional datasets. In this paper, we propose a set of triangle inequalities to enhance the filtering step of Elkan's k-means algorithm. With our new filtering bounds, a filtering-based Elkan (FB-Elkan) is proposed, which preserves the same results as Lloyd's k-means algorithm and additionally prunes unnecessary distance calculations. In addition, a memory-optimized Elkan (MO-Elkan) is provided, where the space complexity is greatly reduced by trading off the maintenance of lower bounds against runtime efficiency. Through evaluations with real-world datasets, FB-Elkan in general accelerates the original Elkan's k-means algorithm on high-dimensional datasets (up to 1.69x), whereas MO-Elkan outperforms the others on low-dimensional datasets (up to 2.48x). Specifically, when a dataset has a large number of points, i.e., \(n\ge 5\)M, MO-Elkan can still derive the exact clustering results, while the original Elkan's k-means algorithm is not applicable due to memory limitations.
Keywords
K-means clustering · Acceleration · Triangle inequalities
1 Introduction
K-means clustering is one of the most popular problems in data mining and machine learning due to its simplicity and applicability. The de facto k-means algorithm, i.e., Lloyd's k-means algorithm [12], performs two steps repeatedly: 1) the assignment step matches each point to its closest center, and 2) the update step recomputes the center of each cluster from its assigned points. The bottleneck in terms of time complexity is identifying the closest center for each input data point, which leads to a high time complexity per iteration, i.e., O(nkd), where n is the number of data points, k is the number of centers, and d is the number of dimensions. In many situations these numbers are large, e.g., for data on the health status of patients, earth observation, or computer vision. Therefore, efficient k-means clustering algorithms are highly desirable.
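To make the two steps and the O(nkd) assignment bottleneck concrete, the following is a minimal sketch of Lloyd's k-means in Python/NumPy; the function and variable names are our own illustration, not part of any library API.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Minimal Lloyd's k-means: O(n*k*d) distance work per iteration."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # k seeds
    for _ in range(n_iter):
        # Assignment step: distance from every point to every center, O(nkd).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned points.
        new_centers = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
            for i in range(k)
        ])
        if np.allclose(new_centers, centers):  # converged: centers stop moving
            break
        centers = new_centers
    return labels, centers
```

Every acceleration technique discussed below targets the computation behind `dists`, which dominates the runtime.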
In order to accelerate the k-means algorithm, two distinct categories of techniques are widely studied in the literature. 1) Approximate solutions: Instead of accelerating the exact k-means algorithm, the techniques in this category compute approximate solutions, e.g., [15, 17, 18], which indeed accelerate k-means, but the final clustering results are not guaranteed to be the same as those of Lloyd's k-means algorithm. 2) Acceleration with exact results: The techniques in this category accelerate the calculation procedure while preserving exactly the results of Lloyd's k-means algorithm. For example, Kanungo et al. [11] and Pelleg et al. [14] accelerate the nearest-neighbor search without computing distances to all k centers by using the properties of special data structures. However, the preprocessing overhead becomes significant when the input datasets are high-dimensional. Alternatively, several acceleration techniques exploit bounds on the distances between data points and centers, e.g., [3, 5, 7, 8, 10, 13, 16]. By maintaining lower and upper bounds on the distances to the cluster centers, most distance calculations can be skipped. In particular, Elkan's k-means algorithm [8], one prominent approach among them, still dominates the others on high-dimensional datasets [13]. Nevertheless, Elkan's k-means algorithm becomes infeasible when the number of data points (n) or centers (k) is large, due to the memory footprint for storing the lower bounds, whose space complexity is O(nk).^{1} Our contributions in this paper are as follows:

Three filtering bounds based on triangle inequalities are proposed to overcome the shortcomings of Elkan's k-means algorithm; with them, most unnecessary distance calculations between points and centers during the iterations of Elkan's k-means algorithm can be pruned (see Sect. 4).

We present how to optimize the original Elkan's k-means algorithm to alleviate its time and space overheads by applying the above filtering bounds. Two optimized algorithms are proposed: filtering-based Elkan (FB-Elkan) and memory-optimized Elkan (MO-Elkan). Specifically, MO-Elkan has space complexity \(O(n+k^2+kd)\), whereas Elkan's k-means algorithm requires \(O(nk+kd)\), where n is the number of input data points, d is the number of dimensions, and k is the number of clusters (see Sect. 5).

Through evaluations, we show that FB-Elkan is in general faster than the original Elkan's k-means algorithm on high-dimensional datasets, whereas MO-Elkan considerably outperforms the others on low-dimensional datasets. Specifically, MO-Elkan can derive the exact clustering results when the number of data points is large, e.g., \(n = 5\)M, while the original algorithm and FB-Elkan may not be applicable due to memory limitations (see Sect. 6).
The rest of this paper is organized as follows: In Sect. 2, we review related work on bound-based accelerated algorithms that produce exactly the same clustering results as the standard (Lloyd's) k-means algorithm. Section 3 defines the notation used in this paper and gives a short overview of Elkan's k-means algorithm, which we use as a backbone. Section 4 presents our new filtering conditions. In Sect. 5, we discuss how to use the proposed bounds to optimize the original Elkan's k-means algorithm. In Sect. 6, extensive evaluation results and discussions on different real-world datasets are presented. Finally, we conclude the paper in Sect. 7.
2 Related Work

Elkan's k-means algorithm [8] takes advantage of lower and upper bounds to reduce redundant distance calculations.

Hamerly [10] proposes to keep only one lower bound per point, on the distance to the point's second-closest center, instead of keeping \(k-1\) lower bounds per point. It is effectively a simplified version of Elkan's k-means algorithm, but it is more efficient on low-dimensional datasets.

Drake and Hamerly [7] extend the above approach [10] to keep a variable number of lower bounds, which is adjusted automatically on the fly. Drake later proposes the Annulus algorithm [6], which prunes the search space for each point using an annular region.

The Yinyang k-means algorithm [5] groups cluster centers, which balances the time spent on filtering against the time spent on distance calculations.

The fast Yinyang k-means algorithm [3] further approximates Euclidean distances using block vectors, which achieves good improvements when the dimensionality of the data is high.

Newling and Fleuret [13] simplify the Yinyang and Elkan k-means algorithms and provide tighter upper and lower bounds for the updates. They also propose the Exponion algorithm, which improves over the Yinyang and Elkan k-means algorithms on low-dimensional datasets.

Ryšavý and Hamerly [16] propose several methods to accelerate all of the aforementioned algorithms, such as producing tighter lower bounds, finding neighboring centers, and accelerating k-means in the first iteration.

The Fission-Fusion k-means algorithm [19] keeps bounds for subgroups of clusters and performs better on low-dimensional datasets.
Elkan's k-means algorithm is known to suffer from the O(nk) space complexity required to store the lower bounds, which may be infeasible for large k, as demonstrated in [5, 13]. However, among the aforementioned accelerated k-means algorithms, Elkan's performs best in terms of runtime on high-dimensional datasets, as shown in [13], e.g., Gassensor \((d = 128)\), KDDcup98 \((d = 310)\), and MNIST784 \((d = 784)\).^{2} Therefore, we are motivated to continue in this same vein and make Elkan's k-means algorithm even faster, or reduce its memory footprint to improve scalability.
3 K-means Clustering and Elkan's K-means Algorithm
For k-means clustering, we are given a positive integer k and a set \(\mathbf{X} \) of n d-dimensional data points. The objective is to partition the data points in \(\mathbf{X} \) into k clusters while minimizing the within-cluster variance, i.e., the sum of squared Euclidean distances between each data point and the center of the cluster it belongs to. In this paper, we use \(t=0,1,2,\ldots \) to index the discrete iterations, and each of the given data points in \(\mathbf{X} \) is classified into one of the k clusters in each iteration t. Specifically, Elkan's k-means algorithm [8] accelerates Lloyd's k-means algorithm by using the triangle inequality.
We use \(C_i(t)\) to denote the set of data points that are classified into the ith cluster at the end of the tth iteration. The ith cluster at the end of the tth iteration is represented by its cluster center \(c_i(t)\). A data point x is classified into the cluster \(C_i(t)\) if the Euclidean distance between x and \(c_i(t)\) is the shortest among all cluster centers. That is, \(x \in C_i(t)\) if \(\delta (x, c_i(t)) \le \delta (x, c_j(t))\) for all j, ties being broken arbitrarily, where \(\delta (x, y)\) is the Euclidean distance between two points x and y. For any t, we have \(\cup _{i=1}^{k} C_i(t) = \mathbf{X} \) and \(C_i(t) \cap C_j(t) = \emptyset \) when \(i\ne j\). In this paper, we assume that the distance between any two points can be calculated in O(d) time and O(1) space.
Initially, when \(t=0\), k seeds are chosen as the initial cluster centers and each data point in \(\mathbf{X} \) is classified into one of the k clusters. At the beginning of the next iteration, i.e., \(t+1\), the ith cluster center is moved to \(c_i(t+1)\), the mean of the data points in \(C_i(t)\). The shift of the cluster center is \(\delta (c_i(t), c_i(t+1))\). In the above procedure, updating the clustering at the end of the tth iteration takes O(nkd) time.

To avoid most of these distance calculations, Elkan's k-means algorithm maintains, for each data point x:

An upper bound \(ub(x, c_i(t))\) on the distance to the cluster center \(c_i(t)\) when x is classified into the ith cluster, i.e., \(x \in C_i(t)\).

\(k-1\) lower bounds \(lb(x, c_j(t))\) on the distances to the other cluster centers \(c_j(t)\) for any \(x \notin C_j(t)\).

When the centers move at the beginning of iteration \(t+1\), these bounds are updated via the triangle inequality by setting

\(ub\left( x, c_i(t+1)\right) \) to \(ub\left( x, c_i(t)\right) + \delta \left( c_i(t), c_i(t+1)\right) \) and

\(lb(x, c_j(t+1))\) to \(lb(x, c_j(t)) - \delta (c_j(t), c_j(t+1))\) for any \(j \ne i\).
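To make this bookkeeping concrete, here is a minimal sketch of the update in Python/NumPy (our own illustration; the array names are assumptions, and the lb matrix is exactly the O(nk) structure discussed below):

```python
import numpy as np

def update_bounds(ub, lb, assign, shift):
    """Elkan-style bound maintenance after the centers move (illustration).

    ub:     (n,)   upper bounds ub(x, c_i(t)) to each point's own center
    lb:     (n, k) lower bounds lb(x, c_j(t)) to every center
    assign: (n,)   index i of each point's current cluster
    shift:  (k,)   center movements delta(c_j(t), c_j(t+1))
    """
    ub += shift[assign]            # ub(x, c_i(t+1)) = ub(x, c_i(t)) + shift_i
    lb -= shift[None, :]           # lb(x, c_j(t+1)) = lb(x, c_j(t)) - shift_j
    np.maximum(lb, 0.0, out=lb)    # distances are never negative
    return ub, lb

# Pruning test (formalized in the lemma below): if ub[p] <= lb[p, j], point p
# cannot be closer to center j, so delta(x, c_j) need not be computed.
```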
Elkan [8] proves, in the following lemma, that a data point x in cluster \(C_i(t)\) will not be assigned to another cluster \(C_j(t+1)\):
Lemma 1
(Elkan [8]) Suppose \(x \in C_i(t)\) and \(j \ne i\). If
\(ub(x, c_i(t+1)) \le lb(x, c_j(t+1))\),   (1)
where the maintained lower bound satisfies
\(lb(x, c_j(t+1)) = lb(x, c_j(t)) - \delta (c_j(t), c_j(t+1)) \le \delta (x, c_j(t+1))\),   (2)
then \(\delta (x, c_i(t+1)) \le \delta (x, c_j(t+1))\), i.e., x will not be classified into cluster \(C_j(t+1)\).
We note that a significant drawback of Elkan's k-means algorithm is its O(nk) space complexity due to the storage of the lower bounds, in addition to the O(nd) input data. The algorithm may not be applicable when nk (or even k) is sufficiently large.
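As a back-of-the-envelope illustration (our own numbers, not from the paper): with \(n = 5\)M points and \(k = 100\) centers, storing one 8-byte lower bound per point-center pair already requires \(5 \times 10^{6} \cdot 100 \cdot 8\) B \(= 4\) GB in addition to the input data, and this cost grows linearly in k. MO-Elkan's \(O(n + k^2 + kd)\) bookkeeping avoids exactly this term.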
4 New Filtering Bounds
Although Elkan's k-means algorithm avoids many unnecessary distance calculations, it has two shortcomings. First, within one iteration, i.e., for fixed t, solely applying Eq. (1) to decide whether a data point cannot be relocated to another cluster can be inefficient. In Sect. 4.1, we propose a simple condition that can be used to filter out centers to which no data point in \(C_i(t)\) will be relocated at the end of the \((t+1)\)th iteration. Second, maintaining the lower bounds \(lb(x, c_j(t))\) can be very expensive, i.e., it requires O(nk) space, and the bounds can become very loose in some scenarios. In Sect. 4.2, we present two new lower bounds that can be applied independently to improve the space complexity and the bound tightness.
4.1 Filtering for Clusters of Points
The following theorem provides a new filtering condition ensuring that a point assigned to cluster \(C_i(t)\) will not be assigned to another cluster \(C_j(t+1)\) at the end of the \((t+1)\)th iteration.
Theorem 1
Proof
The condition in Eq. (3) is always ensured by the original Elkan's k-means algorithm, as this property follows from Lemma 1. The difference here is that a tighter bound is applied if the condition in Eq. (4) holds. This theorem is useful when the distance \(\delta \left( c_i(t+1), c_j(t+1) \right) \) is larger than \(\delta \left( c_i(t), c_j(t) \right) \).
Corollary 1
Suppose that t is a non-negative integer and the upper bound on the Euclidean distance of every data point in cluster \(C_i(t)\) to its center is at most \(UB_i(t)\), i.e., \(UB_i(t) = \max _{x \in C_i(t)}ub(x,c_i(t))\). If \(UB_i(t) < \frac{1}{2} \delta (c_i(t), c_j(t))\) and the condition in Eq. (4) holds \(\forall x \in C_i(t)\), then none of the data points in cluster \(C_i(t)\) will be classified into cluster \(C_j(t+1)\) at the end of the \((t+1)\)th iteration.
Proof
This comes directly from Theorem 1.
4.2 Additional Lower Bounds
In Elkan's k-means algorithm, the way to ensure that a data point x in \(C_i(t)\) cannot be classified into a new cluster \(C_j(t+1)\) for some \(j \ne i\) in the next iteration is to verify that an upper bound on the distance \(\delta (x, c_i(t+1))\) is no more than a lower bound on the distance \(\delta (x, c_j(t+1))\). For the latter, \(lb(x, c_j(t)) - \delta (c_j(t), c_j(t+1))\) is used as a lower bound on \(\delta (x, c_j(t+1))\), as stated in Eq. (2) of Lemma 1.
However, this lower bound becomes very small if the shift of the jth center is significant. In fact, when \(\delta (c_j(t), c_j(t+1))\) is large, it is possible to find a tighter (i.e., larger) lower bound on \(\delta (x, c_j(t+1))\), as presented in the following theorem:
Theorem 2
Suppose that \(x \in C_i(t)\). Then
\(\delta (x, c_j(t+1)) \ge \delta (c_j(t), c_j(t+1)) - \delta (c_i(t), c_j(t)) - ub(x, c_i(t))\).   (5)
Proof
By the triangle inequality, \(\delta (x, c_j(t+1)) \ge \delta (c_j(t), c_j(t+1)) - \delta (x, c_j(t))\). Moreover, \(\delta (x, c_j(t)) \le \delta (x, c_i(t)) + \delta (c_i(t), c_j(t)) \le ub(x, c_i(t)) + \delta (c_i(t), c_j(t))\). Combining the two inequalities yields Eq. (5). \(\square \)
Moreover, the lower bound \(lb(x, c_j(t))\) may not be available if we do not want to keep track of the distances between x and the other \(k-1\) cluster centers that x does not belong to. In that case, if \(c_i(t)\) and \(c_j(t)\) are sufficiently distant, the lower bound in the following theorem can be applied:
Theorem 3
Suppose that \(x \in C_i(t)\). Then
\(\delta (x, c_j(t+1)) \ge \delta (c_i(t), c_j(t+1)) - ub(x, c_i(t))\).   (6)
Proof
By the triangle inequality, \(\delta (x, c_j(t+1)) \ge \delta (c_i(t), c_j(t+1)) - \delta (x, c_i(t)) \ge \delta (c_i(t), c_j(t+1)) - ub(x, c_i(t))\), which is Eq. (6). \(\square \)
We note that the two new lower bounds introduced in Theorems 2 and 3 only require \(ub(x, c_i(t))\) and the distances between cluster centers. Therefore, they can be used to reduce the space complexity when maintaining the lower bounds \(lb(x, c_j(t))\) for all \(x \in \mathbf{X} \) with \(x \notin C_j(t)\) is too expensive, i.e., O(nk), as detailed in Sect. 5.
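As a quick numerical sanity check (our own illustration, not part of the paper's evaluation), the following snippet verifies on random data that center-only lower bounds of this kind never exceed the true distance:

```python
import numpy as np

rng = np.random.default_rng(42)
d = 16
x = rng.normal(size=d)                         # a data point assigned to cluster i
c_i = rng.normal(size=d)                       # c_i(t), the old center of x
c_j = rng.normal(size=d)                       # c_j(t), some other old center
c_j_new = c_j + rng.normal(scale=0.5, size=d)  # c_j(t+1) after the update step

dist = lambda a, b: float(np.linalg.norm(a - b))
ub = dist(x, c_i)   # any valid upper bound works; here the exact distance

# Shift-based lower bound (Theorem 2 style): grows with the shift of c_j.
lb_shift = dist(c_j, c_j_new) - dist(c_i, c_j) - ub
# Cross-center lower bound (Theorem 3 style): old center i vs. new center j.
lb_cross = dist(c_i, c_j_new) - ub

true_d = dist(x, c_j_new)
assert lb_shift <= true_d and lb_cross <= true_d  # both bounds are valid
print(lb_shift, lb_cross, true_d)
```

Both bounds use only quantities that are available without the O(nk) lower-bound table: the point's own upper bound and distances between centers.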
5 Optimized Elkan’s Kmeans
Algorithm 1 presents the pseudo-code of our optimized algorithms. After the initialization (Line 3), the clustering procedure repeats until the process converges, i.e., until all centers stop changing. If Eq. (2) is not used in the algorithm (in Line 27), we can skip the maintenance of the lower bounds in Lines 13, 18, and 35. The pseudo-code consists of two procedures, one for the initialization when t is 0 (i.e., Lines 8 to 13) and one for the \(t'\leftarrow (t+1)\)th iteration (i.e., Lines 14 to 35). We focus our explanation on the latter procedure.
Line 15 updates each of the k centers by calculating the mean of the points assigned to the cluster in the previous iteration. Line 16 calculates the distances between the centers of the last iteration t and those of the current iteration \(t'=t+1\). Line 17 updates the upper bound of the distance from x to its shifted center \(c_i(t+1)\) by applying the triangle inequality. The time complexity of the above steps is \(O((n+k^2)d)\) and the space complexity is \(O(n+k^2+kd)\).
Moreover, Line 18 updates the lower bounds of the distances from x to the other centers, with \(x \notin C_j(t)\), using the triangle inequality if necessary. Line 18 requires O(nk) space and time. In Line 27, one of the following two combinations of the filtering bounds can be applied:

We can apply Eq. (1), Eq. (2), and Eq. (5). For a given x and j, each of them takes O(1) time and space, and Line 29 takes O(d) time. However, this option requires the lower bounds maintained in Lines 13, 18, and 35. We denote this option as filtering-based Elkan, FB-Elkan.

We can apply Eq. (1), Eq. (5), and Eq. (6). For a given x and j, this also takes O(1) time and space, and Line 29 takes O(d) time. This combination does not require the lower bounds maintained in Lines 13, 18, and 35. We denote this option as memory-optimized Elkan, MO-Elkan.
In the pseudo-code, for simplicity of presentation, we use an auxiliary set Temp to store the indexes of the possible new centers for a data point x, maintained in Lines 24 and 30. The data point x is assigned to the closest center in Lines 31 to 35. The time complexity of Lines 25 to 35 is \(O(|\text{Temp}| \cdot d) = O(kd)\) and the space complexity is O(k). Please note that Temp is introduced only for readability; a simple implementation can directly calculate and store the closest index \(j^*\) on the fly using a buffer (instead of recalculating the distance in Line 32). In summary, per iteration the two options have the following complexities:

FB-Elkan: time complexity O(nkd) and space complexity \(O(nk+kd)\).

MO-Elkan: time complexity O(nkd) and space complexity \(O(n+k^2+kd)\).
We note that the above time complexity analysis is asymptotic and does not reflect the actual runtime efficiency of the two algorithms. In particular, the lower bound in Elkan's k-means algorithm, i.e., Eq. (2), is usually tighter than Eq. (6). Therefore, if the space complexity is affordable, using Eq. (2) is more runtime-efficient than using Eq. (6), as further explained in Sect. 6.
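Since the full pseudo-code of Algorithm 1 is not reproduced here, the following is a sketch of one MO-Elkan-style iteration in Python/NumPy: it keeps only one upper bound per point plus the center-center distances (no O(nk) lower-bound table) and uses center-only lower bounds in the spirit of Eq. (6) to skip distance calculations. All names are our own; this illustrates the idea, not the authors' implementation.

```python
import numpy as np

def mo_elkan_iteration(X, centers, assign, ub):
    """One exact assignment round in the spirit of MO-Elkan.

    Keeps one upper bound per point (ub, to its own center) plus the
    O(k^2) center-center distances; no O(n*k) lower-bound table.
    """
    k = len(centers)
    # Update step: move each center to the mean of its assigned points.
    new_centers = np.array([
        X[assign == i].mean(axis=0) if np.any(assign == i) else centers[i]
        for i in range(k)
    ])
    # cross[i, j] = delta(c_i(t), c_j(t+1)); its diagonal is each center's shift.
    cross = np.linalg.norm(centers[:, None, :] - new_centers[None, :, :], axis=2)
    shift = cross.diagonal()
    ub_new = ub + shift[assign]  # valid upper bound to the moved own center

    for p in range(len(X)):
        i = assign[p]
        # Center-only lower bounds in the spirit of Eq. (6):
        # delta(x, c_j(t+1)) >= cross[i, j] - delta(x, c_i(t)) >= cross[i, j] - ub[p].
        lb = cross[i] - ub[p]
        cand = np.unique(np.append(np.flatnonzero(lb < ub_new[p]), i))
        if len(cand) == 1:  # every other center is pruned; x stays put
            ub[p] = ub_new[p]
            continue
        dists = np.linalg.norm(X[p] - new_centers[cand], axis=1)
        best = int(dists.argmin())
        assign[p], ub[p] = cand[best], dists[best]  # exact distance tightens ub
    return new_centers, assign, ub
```

Compared with FB-Elkan, this loop trades the O(nk) lower-bound table of Eq. (2) for occasional extra exact distance computations, which mirrors the time/space trade-off analyzed above.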
6 Evaluation and Discussion
In this section, we first present our evaluation setup. Afterwards, we present the evaluation results as normalized speedups. In addition, we show the scalability of MO-Elkan on large-n datasets, i.e., SUSY and HIGGS. Please note that the ns-bounds provided in [13] could also be included in our algorithms, but we decided not to involve them here due to the page limit.
6.1 Evaluation Setup
We compared the two optimized Elkan's k-means algorithms with the original Elkan's k-means algorithm (denoted as Elkan) [8]: FB-Elkan represents the combination of Eq. (1), Eq. (2), and Eq. (5) in Algorithm 1; MO-Elkan represents the combination of Eq. (1), Eq. (5), and Eq. (6) in Algorithm 1. The presented speedup factors are all normalized to Elkan. If the normalized value is greater than 1, the considered algorithm is faster than Elkan; otherwise, it is slower than Elkan.
Table 1. Speedup normalized to Elkan and variances on high-dimensional datasets. For simplicity of presentation, the shown variance is set to 0 if the calculated value is less than \(10^{-4}\).
Dataset  n  d  k  MO-Elkan  Variance  FB-Elkan  Variance
Covtype  150000  54  10  0.33  0.006  1.14  0.0004
Covtype  150000  54  50  0.44  0.49  1.18  0.04
Covtype  150000  54  100  0.56  0.41  1.19  0.23
Covtype  150000  54  500  0.67  4.07  0.70  1.23
KDDcup98  95412  56  10  0.32  0.008  1.40  0.004
KDDcup98  95412  56  50  0.39  0.12  1.18  0.017
KDDcup98  95412  56  100  0.49  1.28  1.09  0.22
KDDcup98  95412  56  500  0.53  6.33  0.83  0.25
KDDcup04  145751  74  10  0.26  0.43  1.05  0.005
KDDcup04  145751  74  50  0.25  19.81  1.10  0.60
KDDcup04  145751  74  100  0.24  200.13  1.09  2.35
KDDcup04  145751  74  500  0.18  339.76  1.17  8.71
Gassensor  14000  128  10  0.36  0  1.25  0
Gassensor  14000  128  50  0.37  0.002  1.31  0.0003
Gassensor  14000  128  100  0.43  0.002  1.22  0.0009
Gassensor  14000  128  500  0.47  0.049  1.06  0.07
Usps  7291  256  10  0.33  0.003  1.11  0.0003
Usps  7291  256  50  0.27  0.05  1.48  0.002
Usps  7291  256  100  0.39  0.12  1.69  0.015
Usps  7291  256  500  0.56  2.28  1.28  0.68
MNIST784  60000  784  10  0.57  0.088  1.19  0.003
MNIST784  60000  784  50  0.30  1.36  1.28  0.15
MNIST784  60000  784  100  0.43  9.69  1.10  0.13
MNIST784  60000  784  500  0.45  15.73  1.38  1.23
Table 2. Speedup normalized to Elkan and variances on low-dimensional datasets. For simplicity of presentation, the shown variance is set to 0 if the calculated value is less than \(10^{-4}\).
Dataset  n  d  k  MO-Elkan  Variance  FB-Elkan  Variance
birth  100000  2  10  1.03  0.0001  0.94  0
birth  100000  2  50  1.64  0.0006  0.92  0.011
birth  100000  2  100  1.90  0.018  0.90  0.0006
birth  100000  2  500  1.69  0.038  0.89  0.008
skin_noneskin  245057  3  10  0.95  0  0.90  0
skin_noneskin  245057  3  50  1.42  0.002  0.92  0.002
skin_noneskin  245057  3  100  1.43  0.0036  0.93  0.0036
skin_noneskin  245057  3  500  1.49  0.61  0.95  0.078
3D_spatial_network  434874  4  10  1.36  0  0.91  0
3D_spatial_network  434874  4  50  2.30  0  0.86  0
3D_spatial_network  434874  4  100  2.48  0.001  0.91  0.004
3D_spatial_network  434874  4  500  1.27  0.005  0.93  0.001
6.2 Runtime Efficiency Evaluation
On high-dimensional datasets (see Table 1), FB-Elkan mostly outperforms the others and achieves speedups of up to 1.69x. The variances also grow with k for each dataset. However, when the number of clusters k is as large as 500, we observe that the benefit of the filtering routines, i.e., avoiding unnecessary distance calculations, is offset by the overhead of calculating the filtering bounds. For the Covtype dataset, the additional time for calculating Eq. (5) increases from \(11\%\) to over \(20\%\) when k increases from 50 to 500, whereas the original Elkan's k-means algorithm has no such overhead.
On low-dimensional datasets (see Table 2), the variance of the measured results is almost negligible. Moreover, MO-Elkan reaches speedups of up to 2.48x, whereas FB-Elkan performs slightly worse than Elkan. In fact, the overhead of checking the additional filtering bounds in FB-Elkan is higher than the benefit of filtering unnecessary distance calculations. For a similar reason, MO-Elkan requires fewer memory accesses for the filtering bounds and is therefore faster than Elkan on such datasets.
6.3 Scalability Evaluation
Table 3. Speedup normalized to Elkan on large-n datasets. A '✓' marks configurations where only MO-Elkan completed, so no normalized speedup can be reported; a '–' marks runs that were not applicable due to the memory limitation.
Dataset  n  d  k  MO-Elkan  FB-Elkan
SUSY  5M  18  5  0.19  1.17
SUSY  5M  18  10  0.15  1.05
SUSY  5M  18  50  0.20  0.96
SUSY  5M  18  100  0.28  0.95
SUSY  5M  18  500  ✓  –
HIGGS  11M  28  5  0.14  1.10
HIGGS  11M  28  10  0.11  1.08
HIGGS  11M  28  50  0.07  1.10
HIGGS  11M  28  100  ✓  –
HIGGS  11M  28  500  ✓  –
7 Conclusion and Outlook
In this paper, we presented new filtering bounds to optimize Elkan's k-means algorithm. Specifically, two different combinations of the proposed bounds were introduced, either to filter out more unnecessary distance calculations (FB-Elkan) or to reduce the space complexity (MO-Elkan) and thus improve the scalability of the original Elkan's k-means algorithm. Through extensive evaluations on several real-world datasets, we conclude that FB-Elkan improves the runtime efficiency of Elkan on high-dimensional datasets, and that MO-Elkan outperforms the others on low-dimensional datasets while improving the scalability of Elkan, i.e., its memory footprint is dominated mainly by the number of data points.
In future work, we plan to integrate the proposed filtering bounds into other bound-based accelerated k-means algorithms. For example, an integration with the Fission-Fusion k-means algorithm [19] may additionally refine the bounds not only for each data point but also for each cluster. Integrating our bounds with the Yinyang [5] and fast Yinyang k-means algorithms [3] can also be expected to greatly reduce the computation time of distance calculations.
Footnotes
1. The O(nd) space complexity of the input points is ignored in our complexity analysis.
2. In fact, Elkan's k-means algorithm using the ns-bounds derived from the norm of a sum in [13] sometimes outperforms the original Elkan's k-means algorithm.
3. Due to the amount of time required for each test, this was the number we could reach for all setups to fairly demonstrate the statistical significance of the differences.
Acknowledgement
We thank our colleague Mr. Mikail Yayla for his precious comments at early stages. This paper has been supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), as part of the Collaborative Research Center (SFB 876), “Providing Information by ResourceConstrained Analysis” (project number 124020371), project A1 (http://sfb876.tudortmund.de).
References
1. Arthur, D., Vassilvitskii, S.: K-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2007, pp. 1027–1035. Society for Industrial and Applied Mathematics (2007)
2. Bache, K., Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml
3. Bottesch, T., Bühler, T., Kächele, M.: Speeding up k-means by approximating Euclidean distances via block vectors. In: Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, vol. 48, pp. 2578–2586. JMLR.org (2016)
4. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)
5. Ding, Y., Zhao, Y., Shen, X., Musuvathi, M., Mytkowicz, T.: Yinyang k-means: a drop-in replacement of the classic k-means with consistent speedup. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, vol. 37, pp. 579–587. JMLR.org (2015)
6. Drake, J.: Faster k-means clustering. Master's thesis, Baylor University (2013)
7. Drake, J., Hamerly, G.: Accelerated k-means with adaptive distance bounds. In: 5th NIPS Workshop on Optimization for Machine Learning (2012)
8. Elkan, C.: Using the triangle inequality to accelerate k-means. In: Proceedings of the Twentieth International Conference on Machine Learning, ICML 2003, pp. 147–153. AAAI Press (2003)
9. Fränti, P., Sieranoja, S.: K-means properties on six clustering benchmark datasets (2018). http://cs.uef.fi/sipu/datasets/
10. Hamerly, G.: Making k-means even faster. In: SDM, pp. 130–140 (2010)
11. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24, 881–892 (2002)
12. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982). https://doi.org/10.1109/TIT.1982.1056489
13. Newling, J., Fleuret, F.: Fast k-means with accurate bounds. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of the 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 936–944. New York (2016)
14. Pelleg, D., Moore, A.: Accelerating exact k-means algorithms with geometric reasoning. In: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 1999, pp. 277–281. Association for Computing Machinery, New York (1999). https://doi.org/10.1145/312129.312248
15. Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
16. Ryšavý, P., Hamerly, G.: Geometric methods to accelerate k-means algorithms. In: Proceedings of the 2016 SIAM International Conference on Data Mining, pp. 324–332 (2016)
17. Sculley, D.: Web-scale k-means clustering. In: Proceedings of the 19th International Conference on World Wide Web, pp. 1177–1178. Association for Computing Machinery, New York (2010)
18. Wang, J., Wang, J., Ke, Q., Zeng, G., Li, S.: Fast approximate k-means via cluster closures. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3037–3044 (2012)
19. Yu, Q., Dai, B.R.: Accelerating k-means by grouping points automatically. In: Bellatreche, L., Chakravarthy, S. (eds.) DaWaK 2017. LNCS, vol. 10440, pp. 199–213. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64283-3_15