Abstract
This paper addresses word sense induction from lexical co-occurrence graphs. We construct such graphs from large Russian corpora and use them to cluster Mail.ru search results according to the meanings of the query. We compare several methods of performing this clustering, as well as several source corpora, and describe models for applying distributional semantics to large-scale linguistic data.
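To make the general idea concrete, the following is a minimal sketch of sense induction from a co-occurrence graph, not the procedure evaluated in the chapter. It assumes that two words are linked whenever they share a sentence, that the target word is removed from its own neighbourhood, and that each connected component of what remains counts as one sense. Plain connected components stand in here for a real graph clustering method. The function names, the `min_count` threshold, and the English toy corpus are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Count how often two distinct words share a sentence."""
    graph = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for a, b in combinations(sorted(set(tokens)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def induce_senses(graph, target, min_count=1):
    """Split the neighbours of `target` into connected components.

    The target itself is left out, so neighbours stay connected only
    through each other; each component is treated as one sense.
    """
    neighbours = {w for w, c in graph[target].items() if c >= min_count}
    senses, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word in component:
                continue
            component.add(word)
            stack.extend(
                other
                for other, c in graph[word].items()
                if other in neighbours and c >= min_count and other not in component
            )
        seen |= component
        senses.append(component)
    return senses


def assign_to_sense(snippet_tokens, senses):
    """Return the index of the sense sharing most words with a snippet, or None."""
    overlaps = [len(sense & set(snippet_tokens)) for sense in senses]
    best = max(range(len(senses)), key=overlaps.__getitem__, default=None)
    return best if best is not None and overlaps[best] > 0 else None


# Hypothetical toy corpus: already tokenized, one list per sentence.
sentences = [
    ["jaguar", "car", "engine"],
    ["jaguar", "engine", "speed"],
    ["jaguar", "cat", "jungle"],
    ["jaguar", "jungle", "prey"],
]
graph = build_cooccurrence_graph(sentences)
senses = induce_senses(graph, "jaguar")

# A search-result snippet is attached to the sense it overlaps with most.
snippet_sense = assign_to_sense(["jaguar", "prey", "hunting"], senses)
```

On this toy corpus the neighbours of the query word fall into two groups, one around vehicles and one around animals, and the snippet is attached to the animal group. A realistic setup would need lemmatization, frequency filtering, and a clustering method that tolerates the noisy edges found in large corpora.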
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Kutuzov, A. (2015). Semantic Clustering of Russian Web Search Results: Possibilities and Problems. In: Braslavski, P., Karpov, N., Worring, M., Volkovich, Y., Ignatov, D.I. (eds) Information Retrieval. RuSSIR 2014. Communications in Computer and Information Science, vol 505. Springer, Cham. https://doi.org/10.1007/978-3-319-25485-2_12
DOI: https://doi.org/10.1007/978-3-319-25485-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25484-5
Online ISBN: 978-3-319-25485-2
eBook Packages: Computer Science, Computer Science (R0)