Document representation based on probabilistic word clustering in customer-voice classification

  • Industrial and commercial application
Pattern Analysis and Applications

Abstract

Customer-voice data play an important role in fields such as marketing, product planning, and quality assurance. However, classifying customer-voice data is problematic because it relies on manual processes. This study focuses on building automatic classifiers for customer-voice data using newly proposed document representation methods based on neural embedding and probabilistic word clustering. Semantically similar terms are grouped into a common cluster. Words represented by neural embeddings are clustered, and the membership strength of each word relative to each cluster is derived from a probabilistic clustering method such as fuzzy C-means or a Gaussian mixture model. Because it takes this membership strength into account, the proposed method is expected to be suitable for classifying customer-voice data consisting of unstructured text. The results show that the proposed method achieved an accuracy of 89.24% with respect to representational effectiveness and an accuracy of 87.76% with respect to classification performance on customer-voice data consisting of 12 classes. Further, the method provided an intuitive interpretation of the generated representation.
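
The idea of representing a document through the soft cluster memberships of its words can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two-dimensional word vectors are invented stand-ins for neural embeddings, the fuzzy C-means routine is the textbook algorithm written from scratch, and averaging the membership rows of a document's words is one plausible aggregation, not necessarily the one used in the article.

```python
# Toy 2-D "embeddings" (invented for illustration; real vectors would come
# from a trained neural embedding model).
word_vectors = {
    "battery": [1.0, 0.1], "charger": [0.9, 0.2], "power": [0.8, 0.0],
    "screen": [0.1, 1.0], "display": [0.2, 0.9], "pixel": [0.0, 0.8],
}


def soft_memberships(points, centers, m):
    """Fuzzy C-means membership update: u_ij = 1 / sum_k (d_ij^2 / d_ik^2)^(1/(m-1))."""
    rows = []
    for x in points:
        # Squared distances to every center, floored to avoid division by zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d2[j] / d2[k]) ** (1.0 / (m - 1.0)) for k in range(len(centers)))
            for j in range(len(centers))
        ])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Return (centers, memberships); memberships[i][j] is point i's strength in cluster j."""
    n, dim = len(points), len(points[0])
    # Deterministic initialisation: evenly spaced data points act as starting centers.
    centers = [list(points[(j * n) // n_clusters]) for j in range(n_clusters)]
    for _ in range(n_iter):
        u = soft_memberships(points, centers, m)
        # Center update: mean of the points weighted by u_ij^m.
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers[j] = [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
    return centers, soft_memberships(points, centers, m)


def document_vector(tokens, vocab_index, memberships):
    """Average the membership rows of the known words (assumed aggregation)."""
    n_clusters = len(memberships[0])
    rows = [memberships[vocab_index[t]] for t in tokens if t in vocab_index]
    if not rows:
        return [0.0] * n_clusters
    return [sum(r[j] for r in rows) / len(rows) for j in range(n_clusters)]


vocab = list(word_vectors)
vocab_index = {w: i for i, w in enumerate(vocab)}
points = [word_vectors[w] for w in vocab]
centers, memberships = fuzzy_c_means(points, n_clusters=2)

# Out-of-vocabulary tokens such as "unknown" are simply skipped.
doc_vec = document_vector(["battery", "charger", "screen", "unknown"], vocab_index, memberships)
print(doc_vec)
```

Each entry of `doc_vec` can be read as the share of the document that falls into one word cluster, which is what makes such a representation easy to inspect. With a Gaussian mixture model, the posterior responsibilities of each word would take the place of the fuzzy memberships in the same aggregation step.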

Acknowledgements

We would like to express our appreciation to LG Electronics, which provided the customer-voice dataset used in the experiments section of our study.

Author information

Corresponding author

Correspondence to Sungzoon Cho.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lee, Y., Song, S., Cho, S. et al. Document representation based on probabilistic word clustering in customer-voice classification. Pattern Anal Applic 22, 221–232 (2019). https://doi.org/10.1007/s10044-018-00772-1

  • DOI: https://doi.org/10.1007/s10044-018-00772-1
