International Conference on Knowledge Engineering and the Semantic Web

Knowledge Engineering and Semantic Web, pp. 195–209

Interactive Coding of Responses to Open-Ended Questions in Russian

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 518)

Abstract

We propose an interactive technique for categorizing responses to open-ended questions. An open-ended question requires a response in the form of a natural-language phrase. Analysis of such phrases typically starts with "coding", that is, identifying the themes of the responses and tagging each response with the themes it represents. The proposed coding technique is based on interactive cluster analysis. We study hierarchical (agglomerative and divisive) and partitional clustering algorithms, both theoretically and empirically, to select the one best suited to short Russian responses. We address the sparseness of short phrases with thesaurus smoothing. We introduce an iterative process in which users can give feedback on a clustering result, and we develop a domain-oriented system of statements for expressing that feedback. The system is proved to be able to produce any clusters the user desires. The technique is implemented as a web service for responses in Russian.
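
To illustrate why thesaurus smoothing helps with sparse short phrases, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy thesaurus (`THESAURUS`), a hypothetical smoothing weight `alpha`, and English stand-in words instead of Russian; the abstract does not specify the actual smoothing scheme or similarity measure. Two responses that share no words have zero cosine similarity as raw bags of words, but become comparable once each word also contributes weight to its thesaurus neighbours.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus (assumption): each word maps to related terms.
THESAURUS = {
    "salary": {"pay", "wage"},
    "pay": {"salary", "wage"},
    "wage": {"salary", "pay"},
}

def smooth(tokens, thesaurus, alpha=0.5):
    """Bag of words where each word also adds alpha * count to its related terms."""
    smoothed = Counter()
    for word, count in Counter(tokens).items():
        smoothed[word] += count
        for related in thesaurus.get(word, ()):
            smoothed[related] += alpha * count
    return smoothed

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(value * b[word] for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Two short responses with no words in common.
r1 = "low salary".split()
r2 = "small pay".split()

raw_sim = cosine(Counter(r1), Counter(r2))
smoothed_sim = cosine(smooth(r1, THESAURUS), smooth(r2, THESAURUS))
print(raw_sim, smoothed_sim)
```

Any clustering algorithm that relies on pairwise similarity could then operate on the smoothed vectors; the choice of `alpha` and of the thesaurus relations to follow is left open here.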

Keywords

Open-ended questions; Short text categorization; Interactive clustering; Russian thesaurus

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia
