Ball K-Medoids: Faster and Exacter

  • Conference paper
Advances in Artificial Intelligence and Security (ICAIS 2021)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1422))

Abstract

Cluster analysis has emerged naturally from the vast amount of data generated in daily life, and it can uncover hidden feature information that supports further analysis. The k-means algorithm is one of the most widely used clustering methods in real-world applications thanks to its simplicity. However, k-means is sensitive to noise and outliers, because a small number of such points can substantially shift the mean of a cluster. To reduce this sensitivity, the k-medoids algorithm instead selects as the new center the data point that minimizes the sum of dissimilarities within the cluster. Nevertheless, k-medoids algorithms are limited by their computational cost and cannot handle data efficiently. To this end, we present a novel k-medoids algorithm, called ball k-medoids, motivated by the theory of ball clusters, the relationships between clusters, and cluster partitioning. It assigns samples to their nearest medoids efficiently and significantly reduces the number of sample-medoid distance calculations. Moreover, a threshold is inferred by the rollback method to reduce the computation of medoid-medoid distances and accelerate clustering. Experiments demonstrate that ball k-medoids is more efficient than other k-medoids algorithms and more accurate than k-means.
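
The contrast between a mean and a medoid described in the abstract can be illustrated with a minimal sketch. It is not the ball k-medoids algorithm itself, whose details are not given here; it only shows the plain medoid definition (the member with the smallest summed dissimilarity), assuming Euclidean distance. The helper name `medoid` and the toy data are hypothetical.

```python
import numpy as np

def medoid(points):
    """Return the member of `points` with the smallest summed distance to all members.

    Assumption: Euclidean distance is used as the dissimilarity.
    """
    # Pairwise difference vectors via broadcasting, shape (n, n, d).
    diffs = points[:, None, :] - points[None, :, :]
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(diffs, axis=-1)
    # Index of the row with the smallest total distance.
    return points[np.argmin(dists.sum(axis=1))]

# Hypothetical toy cluster: four nearby 1-D samples plus one outlier.
cluster = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

print(cluster.mean(axis=0))  # [22.]  the mean is pulled toward the outlier
print(medoid(cluster))       # [3.]   the medoid stays inside the dense group
```

The brute-force search computes all pairwise distances, which is quadratic in the cluster size; this is the kind of cost that the abstract says limits k-medoids in practice.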

Acknowledgement

The authors sincerely thank the handling associate editor and all anonymous reviewers for their valuable comments. This work is supported by the National Natural Science Foundation of China (No. 62076042, No. 61572086), the Key Research and Development Project of Sichuan Province (No. 2020YFG0307, No. 2018TJPT0012), the Key Research and Development Project of Chengdu (No. 2019-YF05–02028-GX), the Innovation Team of Quantum Security Communication of Sichuan Province (No. 17TD0009), the Sichuan Science and Technology Program under Grants 2018RZ0072 and 20ZDYF0660, the Foundation of Chengdu University of Information Technology under Grant J201707, and the National Key R&D Program of China under Grant No. 2017YFB0802300.

Author information

Corresponding author

Correspondence to Shibin Zhang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Peng, Q., Zhang, S., Zhang, J., Huang, Y., Yao, B., Tang, H. (2021). Ball K-Medoids: Faster and Exacter. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Advances in Artificial Intelligence and Security. ICAIS 2021. Communications in Computer and Information Science, vol 1422. Springer, Cham. https://doi.org/10.1007/978-3-030-78615-1_16

  • DOI: https://doi.org/10.1007/978-3-030-78615-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78614-4

  • Online ISBN: 978-3-030-78615-1

  • eBook Packages: Computer Science, Computer Science (R0)
