Matching LIWC with Russian Thesauri: An Exploratory Study

  • Conference paper
Artificial Intelligence and Natural Language (AINL 2020)

Abstract

In Author Profiling research, there is growing interest in lexical resources that provide psychologically meaningful word categories. One such instrument is Linguistic Inquiry and Word Count (LIWC), which was compiled manually in English and has been translated into many other languages. We argue that the resource is substantially subjective, and that translation increases this subjectivity further. As a result, the translated lexical resource is not linguistically transparent. To address this issue, we translate the resource from English to Russian semi-automatically, analyze the translation in terms of agreement, and match the resulting translation with two Russian linguistic thesauri. One of the thesauri is chosen as the better match for the psychologically meaningful categories in question. We then apply this thesaurus to analyze the psychologically meaningful word categories in two Author Profiling tasks based on Russian texts. Our results indicate that linguistically motivated thesauri not only provide objective, linguistically grounded content, but also yield significant correlates of certain psychological states, replicating evidence obtained with hand-crafted lexical resources.
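The abstract does not state how a translated category is matched with a thesaurus. As a rough illustration only, one plausible way to compare a translated word category with thesaurus classes is set overlap. The sketch below uses the Jaccard index; the measure, the function names (`jaccard`, `best_match`), and the toy word lists are assumptions made for illustration, not the authors' procedure or data.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(category_words: set[str], thesaurus: dict[str, set[str]]) -> tuple[str, float]:
    """Return the thesaurus class with the highest overlap with the category."""
    scores = {name: jaccard(category_words, words) for name, words in thesaurus.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data, invented for illustration only (not taken from LIWC or any thesaurus).
translated_category = {"грусть", "печаль", "тоска", "слёзы"}
toy_thesaurus = {
    "sorrow": {"грусть", "печаль", "тоска", "уныние"},
    "joy": {"радость", "веселье", "счастье"},
}

name, score = best_match(translated_category, toy_thesaurus)
print(name, round(score, 2))
```

In practice the word lists would first be lemmatized so that inflected Russian forms map to the same entry; that step is omitted here.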

Notes

  1. Freely available for search and download at https://rusidiolect.rusprofilinglab.ru/.

Acknowledgement

The authors acknowledge support of this study by the Russian Science Foundation, grant No. 18-78-10081. The authors are grateful to the anonymous reviewers for their comments.

Author information

Corresponding author

Correspondence to Polina Panicheva.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Panicheva, P., Litvinova, T. (2020). Matching LIWC with Russian Thesauri: An Exploratory Study. In: Filchenkov, A., Kauttonen, J., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2020. Communications in Computer and Information Science, vol 1292. Springer, Cham. https://doi.org/10.1007/978-3-030-59082-6_14

  • DOI: https://doi.org/10.1007/978-3-030-59082-6_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59081-9

  • Online ISBN: 978-3-030-59082-6

  • eBook Packages: Computer Science, Computer Science (R0)
