Abstract
In this paper, we investigate the problem of humor detection for the Russian language. For our experiments, we used a large collection of jokes from social media, a contrast collection of non-funny sentences, and a small collection of puns. We implemented a large set of features and trained several SVM classifiers. The results are promising and establish a baseline for further research in this direction.
Keywords
Stierlitz is a Soviet spy working deep undercover in Nazi Germany, a protagonist of a TV series from 1972 based on a novel by Yulian Semionov. Stierlitz became a popular joke character in Soviet and post-Soviet culture.
Acknowledgments
We thank Valeria Bolotova and Vladislav Blinov for sharing their humor dataset, as well as Natalia Loukachevitch for providing us with the RuWordNet data.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ermilov, A., Murashkina, N., Goryacheva, V., Braslavski, P. (2018). Stierlitz Meets SVM: Humor Detection in Russian. In: Ustalov, D., Filchenkov, A., Pivovarova, L., Žižka, J. (eds) Artificial Intelligence and Natural Language. AINL 2018. Communications in Computer and Information Science, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-01204-5_17
DOI: https://doi.org/10.1007/978-3-030-01204-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01203-8
Online ISBN: 978-3-030-01204-5
eBook Packages: Computer Science, Computer Science (R0)