
Liver disorder detection using variable-neighbor weighted fuzzy K nearest neighbor approach

Multimedia Tools and Applications

Abstract

Human liver disorders may be genetic or may result from habitual alcohol consumption or viral infection. If not detected at an early stage, they can lead to liver failure or liver cancer. The aim of the proposed method is to detect liver disorders at an early stage using liver function test datasets. Many real-world datasets, including liver disease diagnosis data, are class imbalanced: the number of observations in one class is much larger or smaller than in the other class(es). The traditional K-Nearest Neighbor (KNN) and Fuzzy KNN classifiers do not work well on imbalanced datasets because they treat all neighbors equally. The weighted variant of Fuzzy KNN addresses this by assigning a large weight to neighbors that belong to the minority class and a relatively small weight to neighbors that belong to the majority class. In this paper, the Variable-Neighbor Weighted Fuzzy K Nearest Neighbor approach (Variable-NWFKNN), an improved variant of Fuzzy-NWKNN, is proposed. The proposed Variable-NWFKNN method is implemented on three real-world imbalanced liver function test datasets: BUPA and ILPD from UCI, and MPRLPD. Variable-NWFKNN is compared with the existing NWKNN and Fuzzy-NWKNN methods and achieves an accuracy of 73.91% (BUPA dataset), 77.59% (ILPD dataset) and 87.01% (MPRLPD dataset). Further, when the TL_RUS method is used for preprocessing, the accuracy improves to 78.46% (BUPA dataset), 78.46% (ILPD dataset) and 95.79% (MPRLPD dataset).
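To make the class-weighting idea concrete, the following is a minimal Python sketch of a neighbor-weighted fuzzy KNN vote. It is not the Variable-NWFKNN method itself, whose details are not given in the abstract. It assumes the class-weight formula from neighbor-weighted KNN (reference 42), where the smallest class gets weight 1 and larger classes get less, combined with the inverse-distance vote of fuzzy KNN (reference 29) using crisp training memberships. The function names, the exponent `p`, the fuzzifier `m`, the stabiliser `eps`, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter


def class_weights(labels, p=2.0):
    """Per-class weights: the smallest class gets 1.0, larger classes get less.

    Assumed form: w_c = 1 / (n_c / n_min) ** (1 / p), with p > 1.
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    return {c: 1.0 / (n / smallest) ** (1.0 / p) for c, n in counts.items()}


def weighted_fuzzy_knn(train_x, train_y, query, k=3, m=2.0, p=2.0, eps=1e-9):
    """Return (predicted_label, memberships) for a single query point."""
    weights = class_weights(train_y, p)
    # Keep the k training points closest to the query (Euclidean distance).
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    scores = {c: 0.0 for c in weights}
    for dist, label in nearest:
        # Inverse-distance vote; eps (an assumption) avoids division by zero.
        vote = 1.0 / (dist ** (2.0 / (m - 1.0)) + eps)
        scores[label] += weights[label] * vote
    total = sum(scores.values())
    memberships = {c: s / total for c, s in scores.items()}
    return max(memberships, key=memberships.get), memberships


# Toy imbalanced data (hypothetical): eight majority points, two minority points.
train_x = [(1.0,), (1.2,), (3.0,), (4.0,), (5.0,), (6.0,), (7.0,), (8.0,),
           (0.0,), (-5.0,)]
train_y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

label, memberships = weighted_fuzzy_knn(train_x, train_y, query=(0.5,), k=3)
print(label)  # 1 (the minority class)
```

In this toy case two of the three nearest neighbors belong to the majority class, so an unweighted vote would pick class 0. The class weights (0.5 for the majority, 1.0 for the minority) tip the decision toward the minority class, which is the behaviour the abstract attributes to the weighted variant.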



Author information

Corresponding author

Correspondence to Pushpendra Kumar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Kumar, P., Thakur, R.S. Liver disorder detection using variable-neighbor weighted fuzzy K nearest neighbor approach. Multimed Tools Appl 80, 16515–16535 (2021). https://doi.org/10.1007/s11042-019-07978-3


  • DOI: https://doi.org/10.1007/s11042-019-07978-3
