Abstract
In recent years, as digital transformation has picked up steam, the volume of customer transactional data available to companies has increased. Customer segmentation that exploits this vast amount of transactional data with various data mining techniques has received intensive attention across industries, significant research effort has been devoted to the topic, and a body of literature has begun to accumulate. In this context, the aim of this paper is to provide a comprehensive review of the literature on transactional data-based customer segmentation in order to identify the characteristics of the field, analyze the application of data mining techniques, and highlight important points for further research. To review the existing literature, three major online databases were searched, and 84 relevant articles published in journals of well-known publishers were eventually selected. The identified articles were then analyzed in full against diverse criteria drawn from the stages of the CRISP-DM (CRoss Industry Standard Process for Data Mining) framework, and the results were reported. This systematic literature review can be useful to academics and practitioners alike, as it provides a comprehensive overview of research on customer segmentation using data mining and presents guidelines for future research in this area.
Data Availability
All relevant data and material are presented in the main paper.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Peker, S., Kart, Ö. Transactional data-based customer segmentation applying CRISP-DM methodology: A systematic review. J. of Data, Inf. and Manag. 5, 1–21 (2023). https://doi.org/10.1007/s42488-023-00085-x
DOI: https://doi.org/10.1007/s42488-023-00085-x