Abstract
Observing the current state of natural language processing, especially for the Polish language, one notices that sense-level dictionaries are becoming increasingly popular. For instance, the largest manually annotated sentiment dictionary for Polish is now based on plWordNet (the Polish WordNet) [13], and a significant part of the Polish Linguistic Category Model (LCM-PL) dictionary [10] is annotated at the sense level. Our paper addresses an important question: what is the influence of word sense disambiguation in real-world scenarios, and how does it compare to the simpler baseline of labeling each word with the tag of its most frequent sense? We evaluate both approaches on data sets compiled for studies on fake opinion detection and on predicting levels of self-esteem in the area of social psychology. Our conclusion is that the baseline method vastly outperforms its competitor.
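The contrast between the two labeling strategies can be illustrated with a minimal sketch. It is not the implementation used in the paper: the toy lexicon, the sense identifiers, the assumed frequency ordering and both function names are hypothetical, and the tags (DAV, IAV, SV) merely follow the usual Linguistic Category Model abbreviations.

```python
from typing import Optional

# Hypothetical toy lexicon (not taken from LCM-PL): each lemma maps to a list
# of (sense_id, lcm_tag) pairs, assumed to be ordered from the most frequent
# sense to the least frequent one.
TOY_LEXICON = {
    "uderzyć": [("uderzyć.1", "DAV"), ("uderzyć.2", "IAV")],
    "kochać": [("kochać.1", "SV")],
}


def tag_most_frequent_sense(lemma: str) -> Optional[str]:
    """Baseline: ignore context and return the tag of the first listed sense."""
    senses = TOY_LEXICON.get(lemma)
    return senses[0][1] if senses else None


def tag_with_wsd(lemma: str, sense_id: str) -> Optional[str]:
    """Sense-level tagging: return the tag of the sense chosen by a WSD module."""
    for candidate_id, tag in TOY_LEXICON.get(lemma, []):
        if candidate_id == sense_id:
            return tag
    return None
```

In this sketch the baseline always returns the tag of the first sense of a lemma, whereas the sense-level variant depends on whichever sense identifier an external disambiguation step supplies, so any error made by that step propagates directly to the tag.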
Notes
1. Our data sets were processed in March 2018. We have no information about the version of the WSD module available at that time, nor about any open bugs that might have influenced sense annotation.
2. The study was conducted by members of the Warsaw Evaluative Learning Lab, headed by Professor Robert Balas.
References
Beukeboom, C., Tanis, M., Vermeulen, I.: The language of extraversion: extraverted people talk more abstractly, introverts are more concrete. J. Lang. Soc. Psychol. 32(2), 191–201 (2013)
Hoorens, V.: What’s really in a name-letter effect? Name-letter preferences as indirect measures of self-esteem. Eur. Rev. Soc. Psychol. 25(1), 228–262 (2014)
Kędzia, P., Piasecki, M., Orlińska, M.: Word sense disambiguation based on large scale Polish Clarin heterogeneous lexical resources. Cogn. Stud. 15, 269–292 (2015)
Robins, R.W., Hendin, H.M., Trzesniewski, K.H.: Measuring global self-esteem: construct validation of a single-item measure and the Rosenberg self-esteem scale. Pers. Soc. Psychol. Bull. 27(2), 151–161 (2001)
Rubikowski, M., Wawer, A.: The scent of deception: recognizing fake perfume reviews in Polish. In: Kłopotek, M.A., Koronacki, J., Marciniak, M., Mykowiecka, A., Wierzchoń, S.T. (eds.) IIS 2013. LNCS, vol. 7912, pp. 45–49. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38634-3_6
Rubini, M., Sigall, H.: Taking the edge off of disagreement: linguistic abstractness and self-presentation to a heterogeneous audience. Eur. J. Soc. Psychol. 32(3), 343–351 (2002)
Semin, G.R., Fiedler, K.: The cognitive functions of linguistic categories in describing persons: social cognition and language. J. Pers. Soc. Psychol. 54(4), 558 (1988)
Smith, E.R., Mackie, D.M., Claypool, H.M.: Social Psychology. Psychology Press, Hove (2014)
Wawer, A., Ogrodniczuk, M.: Results of the PolEval 2017 competition: sentiment analysis shared task. In: 8th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics (2017)
Wawer, A., Sarzyńska, J.: The linguistic category model in Polish (LCM-PL). In: Calzolari, N., et al. (eds.) Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Paris, France, May 2018
Wigboldus, D.H., Semin, G.R., Spears, R.: How do we communicate stereotypes? Linguistic bases and inferential consequences. J. Pers. Soc. Psychol. 78(1), 5 (2000)
Youyou, W., Kosinski, M., Stillwell, D.: Computer-based personality judgments are more accurate than those made by humans. Proc. Natl. Acad. Sci. 112(4), 1036–1040 (2015)
Zaśko-Zielińska, M., Piasecki, M., Szpakowicz, S.: A large wordnet-based sentiment lexicon for Polish. In: Proceedings of the International Conference Recent Advances in Natural Language Processing, pp. 721–730. INCOMA Ltd., Shoumen, Bulgaria, Hissar, Bulgaria, September 2015. http://www.aclweb.org/anthology/R15-1092
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Wawer, A., Sarzyńska, J. (2018). Do We Need Word Sense Disambiguation for LCM Tagging? In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2018. Lecture Notes in Computer Science, vol 11107. Springer, Cham. https://doi.org/10.1007/978-3-030-00794-2_21
DOI: https://doi.org/10.1007/978-3-030-00794-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00793-5
Online ISBN: 978-3-030-00794-2
eBook Packages: Computer Science, Computer Science (R0)