Abstract
This article describes an algorithm for clustering messages from user dialogues. We focus on the fact that clustering quality is significantly affected by the number of user questions included in the analyzed subset. The technique was tested on dialogues from the telecom domain, where each dialogue contains one to eight questions. The algorithm combines basic and additional data preprocessing, feature extraction, data augmentation, dimensionality reduction, and a comparative analysis of clustering methods. The article compares the results of the bag-of-words model, agglomerative clustering, and k-means clustering on sets with different numbers of user questions. The best clustering results are obtained when only the first user questions are included in the analyzed subset.
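The idea of restricting the analyzed subset to the first user questions and then clustering bag-of-words vectors can be sketched in a few lines. This is only an illustrative sketch, not the authors' pipeline: the toy `dialogues` data, the helper names `first_questions`, `bag_of_words`, and `kmeans`, and the deterministic centroid initialization are all assumptions made for the example, and the preprocessing, augmentation, dimensionality reduction, and agglomerative clustering steps are omitted.

```python
from collections import Counter

# Hypothetical toy data: each dialogue is an ordered list of user questions.
dialogues = [
    ["how do i change my tariff plan", "what is the price of the new tariff"],
    ["my internet connection is very slow", "can you change my tariff"],
    ["which tariff plan is cheaper", "is my internet slow because of the router"],
    ["internet connection keeps dropping", "what about the roaming price"],
]

def first_questions(dialogues, n):
    """Build the analyzed subset: the first n user questions of every dialogue."""
    return [q for d in dialogues for q in d[:n]]

def bag_of_words(messages):
    """Return (vocabulary, count vectors) for whitespace-tokenized messages."""
    vocab = sorted({w for m in messages for w in m.split()})
    vectors = []
    for m in messages:
        counts = Counter(m.split())
        vectors.append([float(counts[w]) for w in vocab])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Plain Lloyd's k-means; centroids start at the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Cluster only the first question of each dialogue.
subset = first_questions(dialogues, 1)
_, vectors = bag_of_words(subset)
labels = kmeans(vectors, k=2)
for message, label in zip(subset, labels):
    print(label, message)
```

Calling `first_questions(dialogues, 2)` instead would pull in follow-up questions, which in this toy data mix topics within a dialogue; that is the kind of effect on the subset the abstract refers to.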
Acknowledgments
This work was partially financially supported by the Government of the Russian Federation (Grant 08-08).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Feldina, E., Makhnytkina, O. (2021). Clustering Approach to Topic Modeling in Users Dialogue. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_44
DOI: https://doi.org/10.1007/978-3-030-55187-2_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55186-5
Online ISBN: 978-3-030-55187-2
eBook Packages: Intelligent Technologies and Robotics (R0)