Performance of the K-means and fuzzy C-means algorithms in big data analytics

Salman, Zainab; Alomary, Alauddin

doi:10.1007/s41870-023-01436-y

Original Research
Published: 31 October 2023

Volume 16, pages 465–470, (2024)

International Journal of Information Technology

Abstract

Most organizations now rely on cloud computing resources and services to handle big data, and machine learning techniques are applied to extract valuable information from raw data drawn from different sources. This paper evaluates the performance of the K-means (KM) and fuzzy C-means (FCM) algorithms in terms of clustering time and accuracy. As an instance of data analytics, clustering is applied to encrypted data in both distributed and centralized environments. The designed framework uses two datasets, one containing well-separated data and the other overlapped data, to check how the two algorithms perform on each data type. Adding virtual machines in the distributed environment shortens clustering time and reduces computational overhead. Experiments show that FCM obtains better accuracy than KM on overlapped data, whereas on well-separated data KM outperforms FCM, with higher accuracy and shorter clustering time.
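The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: it omits encryption and the distributed setting entirely, and it uses synthetic two-cluster Gaussian data rather than the Kaggle datasets. The function names (`kmeans`, `fuzzy_cmeans`, `make_blobs`, `clustering_accuracy`), the separation values, and the fuzzifier `m = 2` are all assumptions chosen for illustration. The sketch runs textbook K-means and fuzzy C-means on one well-separated and one overlapped dataset and reports the accuracy of each against the known generating labels.

```python
from itertools import permutations

import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (centers, hard labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # New center = mean of assigned points; keep the old center
        # if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


def fuzzy_cmeans(X, c, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Textbook fuzzy C-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # u_ij is proportional to d_ij^(-2/(m-1)), normalised per point.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def make_blobs(separation, n=200, seed=0):
    """Two unit-variance 2-D Gaussian clusters `separation` apart (synthetic)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))
    b = rng.normal(loc=[separation, 0.0], scale=1.0, size=(n, 2))
    return np.vstack([a, b]), np.array([0] * n + [1] * n)


def clustering_accuracy(true_labels, pred_labels, k):
    """Fraction of points matched under the best relabelling of clusters."""
    best = 0.0
    for perm in permutations(range(k)):
        mapped = np.array(perm)[pred_labels]
        best = max(best, float((mapped == true_labels).mean()))
    return best


results = {}
for name, separation in [("well-separated", 8.0), ("overlapped", 2.0)]:
    X, y = make_blobs(separation)
    _, km_labels = kmeans(X, 2)
    _, U = fuzzy_cmeans(X, 2)
    fcm_labels = U.argmax(axis=1)  # harden memberships for scoring
    results[name] = (
        clustering_accuracy(y, km_labels, 2),
        clustering_accuracy(y, fcm_labels, 2),
    )
    print(f"{name}: KM accuracy={results[name][0]:.3f}, "
          f"FCM accuracy={results[name][1]:.3f}")
```

Which algorithm scores higher on such toy data depends on the cluster shapes, the seed, and the fuzzifier, so this sketch should not be read as reproducing the paper's results. Timing each call, for example with `time.perf_counter()`, would add the clustering-time dimension of the comparison.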

Data availability

The datasets used were obtained from the Kaggle website, https://www.kaggle.com/.

Author information

Authors and Affiliations

College of Information Technology, University of Bahrain, Zallaq, Sukhir, 32038, Southern Governorate, Kingdom of Bahrain
Zainab Salman & Alauddin Alomary

Corresponding author

Correspondence to Zainab Salman.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Salman, Z., Alomary, A. Performance of the K-means and fuzzy C-means algorithms in big data analytics. Int. j. inf. tecnol. 16, 465–470 (2024). https://doi.org/10.1007/s41870-023-01436-y

Received: 24 February 2023
Accepted: 21 August 2023
Published: 31 October 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s41870-023-01436-y
