Abstract
We propose an interactive technique for categorizing responses to open-ended questions. An open-ended question requires a response in the form of a natural-language phrase. Analysis of such phrases typically starts with "coding": identifying the themes of the responses and tagging each response with the themes it represents. The proposed coding technique is based on interactive cluster analysis. We study hierarchical (agglomerative and divisive) and partitional clustering algorithms, both theoretically and empirically, to select the best one for short responses in Russian. We address the sparseness of short phrases with thesaurus smoothing. We introduce an iterative process in which users can give feedback on a clustering result, and we develop a domain-oriented system of statements for expressing that feedback. The system is proved capable of producing any clusters the user desires. The technique is implemented as a web service for responses in Russian.
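To illustrate why thesaurus smoothing helps with sparse short phrases, here is a minimal Python sketch. It is not the authors' implementation: the toy English thesaurus, the smoothing weight of 0.5, and the function names `vectorize`, `smooth`, and `cosine` are all assumptions made for illustration. The idea shown is that two responses with no words in common have zero cosine similarity as plain bags of words, but become comparable once each word shares part of its weight with related thesaurus entries.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus (an assumption, not the resource used in the paper):
# each word maps to a set of related words.
THESAURUS = {
    "salary": {"pay", "wage"},
    "pay": {"salary", "wage"},
    "wage": {"salary", "pay"},
}

def vectorize(phrase):
    """Bag-of-words counts for a lowercased, whitespace-tokenized phrase."""
    return Counter(phrase.lower().split())

def smooth(vec, thesaurus, weight=0.5):
    """Add a fraction of each word's count to its thesaurus neighbours."""
    out = Counter(vec)
    for word, count in vec.items():
        for related in thesaurus.get(word, ()):
            out[related] += weight * count
    return out

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

a = vectorize("low salary")
b = vectorize("bad pay")

# Without smoothing the two phrases share no words.
print(cosine(a, b))

# With smoothing, "salary" and "pay" reinforce each other.
print(round(cosine(smooth(a, THESAURUS), smooth(b, THESAURUS)), 3))
```

In this sketch the smoothed vectors could then be passed to any of the clustering algorithms the abstract mentions; the choice of weight controls how strongly related words pull phrases together.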
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Senderovich, N., Maysuradze, A. (2015). Interactive Coding of Responses to Open-Ended Questions in Russian. In: Klinov, P., Mouromtsev, D. (eds) Knowledge Engineering and Semantic Web. KESW 2015. Communications in Computer and Information Science, vol 518. Springer, Cham. https://doi.org/10.1007/978-3-319-24543-0_15
DOI: https://doi.org/10.1007/978-3-319-24543-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24542-3
Online ISBN: 978-3-319-24543-0
eBook Packages: Computer Science, Computer Science (R0)