Abstract
Effective management of customer knowledge leads to efficient Customer Relationship Management (CRM). Clustering, and K-means in particular, is one of the most important data mining techniques used in CRM marketing to predict customer behaviour accurately: it makes it possible to identify customers' behavioural patterns and then to align marketing strategies with customer preferences so as to retain those customers. However, various studies on K-means clustering have observed that customers with different behavioural indicators may appear identical after clustering, which implies that behavioural indicators play no significant role in how customers are clustered. Consequently, when the level of customer participation depends on behavioural parameters such as satisfaction, this can degrade the K-means clusters and lead to unacceptable results. In this paper, a customer behavioural feature, the malicious feature, is taken into account in customer clustering, and a method is presented for finding the optimal number of clusters and the initial values of the cluster centres in order to obtain more accurate results. Finally, because organizations need to extract knowledge from customers' views by ranking customers according to the factors that affect customer value, a method is proposed for modelling customer behaviour and extracting knowledge for customer relationship management. An evaluation on the customers of Hamkaran System’s Company shows that the improved K-means method proposed in this paper outperforms standard K-means in terms of speed and accuracy.
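The two ingredients named above, choosing initial cluster centres and choosing the number of clusters, can be illustrated with a minimal sketch. It is not the method proposed in the paper: it assumes farthest-first (maximin) seeding as a simple deterministic stand-in for the initialisation step, and the mean silhouette score as the criterion for the number of clusters. The `customers` data and all function names are hypothetical, invented only for this illustration.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def seed_centres(points, k):
    """Farthest-first seeding (assumption: a stand-in, not the paper's method)."""
    centres = [points[0]]
    while len(centres) < k:
        # Pick the point whose nearest chosen centre is farthest away.
        centres.append(max(points, key=lambda p: min(dist2(p, c) for c in centres)))
    return centres

def kmeans(points, k, iters=100):
    """Standard Lloyd iterations starting from the seeded centres."""
    centres = seed_centres(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist2(p, centres[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres

def silhouette(points, labels, k):
    """Mean silhouette score; a singleton cluster contributes 0."""
    scores = []
    for i, p in enumerate(points):
        mean_d = {}
        for j in range(k):
            ds = [math.sqrt(dist2(p, q)) for m, q in enumerate(points)
                  if labels[m] == j and m != i]
            if ds:
                mean_d[j] = sum(ds) / len(ds)
        own = labels[i]
        others = [v for j, v in mean_d.items() if j != own]
        if own not in mean_d or not others:
            scores.append(0.0)
            continue
        a, b = mean_d[own], min(others)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def choose_k(points, k_range):
    """Return (score, k, labels, centres) for the k with the best silhouette."""
    best = None
    for k in k_range:
        labels, centres = kmeans(points, k)
        score = silhouette(points, labels, k)
        if best is None or score > best[0]:
            best = (score, k, labels, centres)
    return best

# Hypothetical data: three groups of customers described by two made-up,
# already-scaled features (for example purchase frequency and satisfaction).
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
customers = [(cx + dx, cy + dy)
             for cx, cy in [(0, 0), (10, 0), (0, 10)]
             for dx, dy in offsets]

best_score, best_k, labels, centres = choose_k(customers, range(2, 6))
print("selected number of clusters:", best_k)
```

On this toy data the three groups are well separated, so the silhouette criterion selects three clusters. Real customer data would need feature scaling, and other criteria for the number of clusters may behave differently.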
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Human and animal rights
This article does not contain any studies with human participants or animals performed by the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zare, H., Emadi, S. Determination of Customer Satisfaction using Improved K-means algorithm. Soft Comput 24, 16947–16965 (2020). https://doi.org/10.1007/s00500-020-04988-4
DOI: https://doi.org/10.1007/s00500-020-04988-4