Abstract
Customer-voice data play an important role in fields such as marketing, product planning, and quality assurance. However, classifying customer-voice data is problematic because it relies on manual processes. This study builds automatic classifiers for customer-voice data using newly proposed document representation methods based on neural embedding and probabilistic word clustering. Semantically similar terms are grouped into a common cluster: the word vectors produced by neural embedding are clustered according to the membership strength of each word in each cluster, derived from a probabilistic clustering method such as fuzzy C-means or a Gaussian mixture model. By taking membership strength into account, the proposed method is expected to suit the classification of customer-voice data consisting of unstructured text. The results show that the proposed method achieved an accuracy of 89.24% for representational effectiveness and an accuracy of 87.76% for classification performance on customer-voice data comprising 12 classes. Further, the method provided an intuitive interpretation of the generated representation.
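The idea of representing a document through the soft cluster memberships of its words can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so the code below rests on assumptions: the two-dimensional "embeddings" are made-up toy vectors rather than trained neural embeddings, the fuzzy C-means routine is a plain textbook version written with NumPy, and averaging word memberships into a document vector is only one plausible aggregation. The names `fuzzy_c_means` and `represent` are hypothetical helpers.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Textbook fuzzy C-means. Returns (centers, U), where U[i, k] is the
    membership strength of point i in cluster k and each row sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # random soft assignment to start
    for _ in range(n_iter):
        W = U ** m                                     # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted centroids
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)                 # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)       # membership update
    return centers, U

# Toy vocabulary with made-up 2-D vectors (assumption: stand-ins for
# neural word embeddings; real embeddings would have far more dimensions).
vocab = ["battery", "charger", "power", "screen", "display", "pixel"]
emb = np.array([
    [1.0, 0.1], [0.9, 0.0], [1.1, 0.2],   # power-related words
    [0.0, 1.0], [0.1, 0.9], [0.2, 1.1],   # display-related words
])

centers, U = fuzzy_c_means(emb, n_clusters=2)
membership = dict(zip(vocab, U))

def represent(tokens):
    """Assumed aggregation: average the membership vectors of known words."""
    rows = [membership[t] for t in tokens if t in membership]
    if not rows:
        return np.zeros(U.shape[1])
    return np.mean(rows, axis=0)

doc_vec = represent(["battery", "charger", "screen"])
print(doc_vec)
```

Each component of `doc_vec` can be read as the share of the document attributable to one word cluster, which is the kind of interpretable representation the abstract alludes to. A Gaussian mixture model could replace fuzzy C-means here by using its posterior component probabilities as the membership strengths.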
Acknowledgements
I would like to express my appreciation to LG Electronics, which provided the customer-voice dataset used in the experiments section of our study.
About this article
Cite this article
Lee, Y., Song, S., Cho, S. et al. Document representation based on probabilistic word clustering in customer-voice classification. Pattern Anal Applic 22, 221–232 (2019). https://doi.org/10.1007/s10044-018-00772-1
DOI: https://doi.org/10.1007/s10044-018-00772-1