Determination of Customer Satisfaction using Improved K-means algorithm

  • Methodologies and Application
Soft Computing

Abstract

Effective management of customer knowledge leads to efficient Customer Relationship Management (CRM). Clustering, and K-means in particular, is one of the most important data mining techniques used in CRM marketing to predict customer behaviour accurately: it makes it possible to identify customers’ behavioural patterns and then align marketing strategies with customer preferences in order to retain customers. However, various studies on K-means clustering have observed that customers with different behavioural indicators may appear to be the same after clustering, which implies that the behavioural indicators play no significant role in the clustering. Consequently, if the level of customer participation depends on behavioural parameters such as satisfaction, this can degrade the K-means clusters and lead to unacceptable results. In this paper, a customer behavioural feature (the malicious feature) is taken into account in customer clustering, together with a method for finding the optimal number of clusters and the initial values of the cluster centres, in order to obtain more accurate results. Finally, because organizations need to extract knowledge from customers’ views by ranking customers according to the factors that affect customer value, a method is proposed for modelling customer behaviour and extracting knowledge for CRM. An evaluation on the customers of Hamkaran System’s Company shows that the improved K-means method proposed in this paper outperforms K-means in terms of speed and accuracy.
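
The abstract names two choices that K-means leaves open: the number of clusters and the initial cluster centres. The sketch below illustrates one common, generic way to handle both. It is not the improved method proposed in the paper, whose details are not given in this preview. It assumes k-means++ style seeding for the initial centres and the mean silhouette coefficient for choosing the number of clusters. The function names (`seed_centres`, `kmeans`, `silhouette`, `choose_k`) and the two-feature toy customers are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seed_centres(points, k, rng):
    """k-means++ style seeding (assumes at least k distinct points)."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        # Weight each point by its squared distance to the nearest centre,
        # so that far-away points are more likely to become new centres.
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm; returns (centres, labels)."""
    centres = seed_centres(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centres, labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # not defined for a single cluster
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, candidates, restarts=5, seed=0):
    """Return (k, score, labels) with the highest mean silhouette."""
    rng = random.Random(seed)
    best = None
    for k in candidates:
        for _ in range(restarts):
            _, labels = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if best is None or score > best[1]:
                best = (k, score, labels)
    return best


# Hypothetical customers described by two scaled behavioural features.
customers = [
    (0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.2),
    (9.8, 0.3), (10.1, 0.1), (9.9, 0.4),
]
k, score, labels = choose_k(customers, candidates=range(2, 6))
print(k, round(score, 3), labels)
```

On this well-separated toy data the silhouette criterion should favour three clusters. Real customer data would normally need feature scaling and more candidate values of k, and the silhouette computation above is quadratic in the number of customers.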

Author information

Corresponding author

Correspondence to Sima Emadi.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Human and animal rights

This article does not contain any studies with human participants or animals performed by the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zare, H., Emadi, S. Determination of Customer Satisfaction using Improved K-means algorithm. Soft Comput 24, 16947–16965 (2020). https://doi.org/10.1007/s00500-020-04988-4

  • DOI: https://doi.org/10.1007/s00500-020-04988-4
