Abstract
Data management is a major challenge in the healthcare sector, because patient numbers rise with population growth and lifestyle changes. Data analytics and big data are increasingly used to address analytical problems, often through machine learning techniques. Cancer has become a major concern in both developed and developing countries, and it can be fatal if not diagnosed at an early stage. Late diagnosis, and hence delayed treatment, reduces the chance of survival. Early detection is therefore critical to improving cancer outcomes. This study aims at early diagnosis of cancer using more efficient analytical techniques. Accuracy of prediction also plays an important role in improving the quality of care and thereby the survival rate. The datasets used in this study are taken from the UCI Machine Learning Repository and were prepared by the University of Wisconsin Hospitals. For diagnosis and classification, the K-Nearest Neighbor (KNN) classifier is applied with different values of K, a process referred to here as KNN clustering. The performance of KNN is then compared with that of K-Means clustering on the same datasets.
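The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a small synthetic two-class dataset instead of the UCI data, all function names (`knn_predict`, `kmeans`, `make_toy_data`) are hypothetical, and scoring K-Means by mapping each cluster to its majority training label is an assumption about how a clustering result might be turned into an accuracy figure.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(X, k)  # initial centroids are data points
    assignments = [0] * len(X)
    for _ in range(iters):
        assignments = [min(range(k), key=lambda c: euclidean(x, centroids[c]))
                       for x in X]
        updated = []
        for c in range(k):
            members = [x for x, a in zip(X, assignments) if a == c]
            if members:
                updated.append([sum(col) / len(members)
                                for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignments


def make_toy_data(n_per_class=60, seed=42):
    """Synthetic stand-in for a two-class diagnostic dataset (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("benign", (2.0, 2.0)), ("malignant", (7.0, 7.0))):
        for _ in range(n_per_class):
            X.append([rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)])
            y.append(label)
    return X, y


def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# 70/30 train/test split on shuffled indices.
X, y = make_toy_data()
indices = list(range(len(X)))
random.Random(1).shuffle(indices)
split = int(0.7 * len(indices))
train_X = [X[i] for i in indices[:split]]
train_y = [y[i] for i in indices[:split]]
test_X = [X[i] for i in indices[split:]]
test_y = [y[i] for i in indices[split:]]

# KNN evaluated for several values of K.
knn_scores = {
    k: accuracy([knn_predict(train_X, train_y, q, k) for q in test_X], test_y)
    for k in (1, 3, 5, 7)
}

# K-Means with two clusters; each cluster takes its majority training label.
centroids, assignments = kmeans(train_X, k=2)
cluster_label = {}
for c in range(2):
    labels = [lab for lab, a in zip(train_y, assignments) if a == c]
    cluster_label[c] = Counter(labels).most_common(1)[0][0] if labels else None
kmeans_pred = [
    cluster_label[min(range(2), key=lambda c: euclidean(q, centroids[c]))]
    for q in test_X
]
kmeans_score = accuracy(kmeans_pred, test_y)

for k, score in knn_scores.items():
    print(f"KNN (K={k}) accuracy: {score:.3f}")
print(f"K-Means (2 clusters) accuracy: {kmeans_score:.3f}")
```

On the toy data the two classes are well separated, so both methods score similarly; the numbers say nothing about the results reported in the study. Replacing `make_toy_data` with a loader for the actual dataset would be the natural next step.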
Cite this article
Mittal, K., Aggarwal, G. & Mahajan, P. Performance study of K-nearest neighbor classifier and K-means clustering for predicting the diagnostic accuracy. Int. j. inf. tecnol. 11, 535–540 (2019). https://doi.org/10.1007/s41870-018-0233-x
DOI: https://doi.org/10.1007/s41870-018-0233-x