Abstract
This paper discusses a distributed design for clustering based on the K-means algorithm in a switching multi-agent network, for the case where the data are stored in a decentralized manner so that the full dataset is not available to every agent. The authors propose a consensus-based algorithm for the distributed case, namely the double-clock consensus-based K-means algorithm (DCKA). Under mild connectivity conditions, the authors show that DCKA converges, which guarantees a distributed solution to the clustering problem even though the network topology is time-varying. Moreover, the authors provide experimental results on various clustering datasets to illustrate the effectiveness of the fully distributed algorithm DCKA, whose performance may be better than that of the centralized K-means algorithm.
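The abstract does not spell out the update rules of DCKA, so the following Python sketch should be read only as a generic illustration of consensus-based K-means with two time scales, not as the authors' algorithm. It assumes a slow clock on which each agent reassigns its local points and a fast clock on which agents average per-cluster sums and counts with their current neighbors. The Metropolis weights, the alternating edge sets, the shared initial centroids, and all function names are assumptions introduced here for illustration.

```python
def metropolis_weights(n, edges):
    """Doubly stochastic weights for an undirected edge set (assumed scheme)."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        W[i][j] = W[j][i] = 1.0 / (1 + max(deg[i], deg[j]))
    for i in range(n):
        W[i][i] = 1.0 - sum(W[i])  # diagonal is still 0.0 when summed
    return W

def consensus_step(W, states):
    """One fast-clock round: each agent mixes its vector with its neighbors'."""
    n, d = len(states), len(states[0])
    return [[sum(W[i][j] * states[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]

def local_stats(points, centroids):
    """Flattened per-cluster coordinate sums and counts for one agent's data."""
    k, dim = len(centroids), len(centroids[0])
    stats = [0.0] * (k * (dim + 1))
    for p in points:
        nearest = min(range(k), key=lambda c: sum(
            (p[t] - centroids[c][t]) ** 2 for t in range(dim)))
        for t in range(dim):
            stats[nearest * (dim + 1) + t] += p[t]
        stats[nearest * (dim + 1) + dim] += 1.0
    return stats

def distributed_kmeans(local_data, init_centroids, graphs,
                       outer_iters=10, inner_iters=50):
    """Two-time-scale consensus K-means over a periodically switching graph."""
    n = len(local_data)
    k, dim = len(init_centroids), len(init_centroids[0])
    # Assumption: every agent starts from the same initial centroids.
    centroids = [[list(c) for c in init_centroids] for _ in range(n)]
    step = 0
    for _ in range(outer_iters):                      # slow clock
        states = [local_stats(local_data[i], centroids[i]) for i in range(n)]
        for _ in range(inner_iters):                  # fast clock
            W = metropolis_weights(n, graphs[step % len(graphs)])
            states = consensus_step(W, states)
            step += 1
        for i in range(n):
            for c in range(k):
                count = states[i][c * (dim + 1) + dim]
                if count > 1e-9:  # keep the old centroid for empty clusters
                    centroids[i][c] = [states[i][c * (dim + 1) + t] / count
                                       for t in range(dim)]
    return centroids

# Four agents, each holding two 2-D points; no agent sees the full dataset.
local_data = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0), (10.0, 10.0)],
    [(11.0, 10.0), (10.0, 11.0)],
    [(1.0, 1.0), (11.0, 11.0)],
]
# Each edge set alone is disconnected; their union is a ring.
graphs = [[(0, 1), (2, 3)], [(1, 2), (3, 0)]]
result = distributed_kmeans(local_data, [(0.0, 0.0), (10.0, 10.0)], graphs)
for agent, cents in enumerate(result):
    print(agent, cents)
```

Because the sums and the counts are averaged with the same weights, their ratio recovers the global cluster mean once the agents agree, without any agent sharing raw data points. In this toy network neither edge set is connected on its own, which loosely mirrors the idea that connectivity is only needed jointly over time; the precise conditions used in the paper are not stated in the abstract.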
Additional information
This research was supported by the National Key Research and Development Program of China under Grant No. 2016YFB0901902 and the National Natural Science Foundation of China under Grant Nos. 61573344, 61333001, 61733018, and 61374168.
This paper was recommended for publication by Editor HU Xiaoming.
Cite this article
Lin, P., Wang, Y., Qi, H. et al. Distributed Consensus-Based K-Means Algorithm in Switching Multi-Agent Networks. J Syst Sci Complex 31, 1128–1145 (2018). https://doi.org/10.1007/s11424-018-7102-3
DOI: https://doi.org/10.1007/s11424-018-7102-3