Case studies on using natural language processing techniques in customer relationship management software


How can we use a text corpus stored in a customer relationship management (CRM) database for data mining and segmentation? To answer this question, we adopted state-of-the-art methods commonly used in the natural language processing (NLP) literature, such as word embeddings, and in the deep learning literature, such as recurrent neural networks (RNNs). We used the text notes taken in a CRM system by customer representatives of an internet ads consultancy agency between 2009 and 2020. We trained word embeddings on this text corpus and showed that they can be used directly for data mining and can also be used in RNN architectures, deep learning models built with long short-term memory (LSTM) units, for more comprehensive segmentation objectives. The results show that structured text data populated in a CRM can be mined for valuable information. Hence, any CRM can be equipped with useful NLP features once the problem definitions are correctly built and the solution methods are conveniently implemented.
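
As a minimal sketch of what using word embeddings "directly for data mining" can look like, the snippet below ranks the nearest neighbours of a query word by cosine similarity. The vectors, the vocabulary, and the function names are hypothetical toy values chosen for illustration; they are not taken from the study, where the embeddings are trained on the CRM notes.

```python
import math

# Hypothetical 3-dimensional toy embeddings (assumption: real embeddings
# would be trained on the CRM text corpus and have many more dimensions).
EMBEDDINGS = {
    "invoice": [0.9, 0.1, 0.0],
    "payment": [0.8, 0.2, 0.1],
    "campaign": [0.1, 0.9, 0.2],
    "ads": [0.0, 0.8, 0.3],
    "holiday": [0.1, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings, top_n=2):
    """Rank all other words by cosine similarity to `word`, highest first."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for neighbour, score in most_similar("invoice", EMBEDDINGS):
        print(f"{neighbour}: {score:.3f}")
```

With these toy vectors, "payment" ranks closest to "invoice" and "campaign" closest to "ads". The same kind of query on embeddings trained on real notes would surface terms that representatives use in similar contexts.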




  1. In this study, we followed the procedures and principles set out by the personal data protection regulations to which the company is subject, as well as the company’s privacy policy, of which customers are notified.



Author information




All the related work was carried out by the corresponding author.

Corresponding author

Correspondence to Şükrü Ozan.

Additional information


Availability of Data and Material

The data are strictly confidential and therefore cannot be shared under any circumstances.

Code Availability

The code is available in Jupyter Notebook format and can be shared upon request.


About this article


Cite this article

Ozan, Ş. Case studies on using natural language processing techniques in customer relationship management software. J Intell Inf Syst (2020).



  • Customer relationship management
  • Word embeddings
  • Machine learning
  • Natural language processing
  • Recurrent neural networks