Recent Advances on Distributed Unsupervised Learning

  • Conference paper
Advances in Neural Networks (WIRN 2015)

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 54)

Abstract

Distributed machine learning is the problem of inferring a desired relation when the training data is distributed throughout a network of agents, such as a sensor network or a robot swarm. A typical unsupervised learning problem is clustering, that is, grouping patterns according to some similarity or dissimilarity measure. Clustering algorithms can be adopted in large-scale distributed systems provided they are highly scalable, fault-tolerant and energy-efficient. This work surveys the state of the art in this field, presenting algorithms that solve the distributed clustering problem efficiently, with particular attention to the computation and clustering criteria.
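
To make the setting concrete, the following is a minimal sketch of one common way to cluster data that is split across agents: a k-means variant in which each agent shares only per-cluster sums and counts rather than raw patterns. It is an illustration under stated assumptions, not an algorithm taken from the surveyed paper. The function names (`local_stats`, `merge_and_update`, `distributed_kmeans`), the toy data and the idealised aggregation step (every agent's statistics are simply summed in one place) are all hypothetical.

```python
# Illustrative sketch only: k-means over data partitioned across agents.
# All names and data here are hypothetical, not from the surveyed paper.
from math import dist  # Euclidean distance, Python 3.8+


def local_stats(points, centroids):
    """Per-agent step: assign each local point to its nearest centroid and
    return per-cluster coordinate sums and counts. Raw points stay local."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        j = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def merge_and_update(stats, centroids):
    """Aggregation step: add up the agents' sums and counts, then recompute
    each centroid. A cluster with no points keeps its previous centroid."""
    updated = []
    for j, old in enumerate(centroids):
        n = sum(counts[j] for _, counts in stats)
        if n == 0:
            updated.append(list(old))
            continue
        updated.append(
            [sum(sums[j][d] for sums, _ in stats) / n for d in range(len(old))]
        )
    return updated


def distributed_kmeans(agents, centroids, iterations=10):
    """Run a fixed number of rounds; each round needs one exchange of
    (sums, counts) per agent. Assumes an idealised, loss-free aggregation."""
    for _ in range(iterations):
        stats = [local_stats(points, centroids) for points in agents]
        centroids = merge_and_update(stats, centroids)
    return centroids


# Toy data: three agents, each holding part of two well-separated groups.
agents = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
    [(2.0, 0.0), (2.0, 2.0), (12.0, 10.0), (12.0, 12.0)],
]
initial = [(0.0, 0.0), (10.0, 10.0)]
centroids = distributed_kmeans(agents, initial)

# Same data held by a single agent, for comparison with the centralised case.
pooled = [[p for points in agents for p in points]]
centralised = distributed_kmeans(pooled, initial)
```

Because sums and counts are additive, the aggregated update in this sketch matches the centroid update a single machine would compute on the pooled data. A real deployment would replace the central sum with a protocol suited to the network, for example a consensus or diffusion scheme, which this sketch does not model.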

Author information

Corresponding author

Correspondence to Massimo Panella.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Rosato, A., Altilio, R., Panella, M. (2016). Recent Advances on Distributed Unsupervised Learning. In: Bassis, S., Esposito, A., Morabito, F., Pasero, E. (eds) Advances in Neural Networks. WIRN 2015. Smart Innovation, Systems and Technologies, vol 54. Springer, Cham. https://doi.org/10.1007/978-3-319-33747-0_8

  • DOI: https://doi.org/10.1007/978-3-319-33747-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-33746-3

  • Online ISBN: 978-3-319-33747-0

  • eBook Packages: Engineering (R0)
