Crosslingual Depression Detection in Twitter Using Bilingual Word Alignments

Coello-Guilarte, Laritza; Ortega-Mendoza, Rosa María; Villaseñor-Pineda, Luis; Montes-y-Gómez, Manuel

doi:10.1007/978-3-030-28577-7_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11696)

Included in the following conference series:

International Conference of the Cross-Language Evaluation Forum for European Languages

Abstract

Depression is a mental disorder with strong social and economic implications. Given its relevance, several recent studies have analyzed social media content to identify and track depressed users. Most approaches follow a supervised learning strategy that depends on the availability of labeled training data. Unfortunately, acquiring such data is complex and costly. To address this problem, we propose a crosslingual approach based on the idea that data already labeled in one language can be leveraged to classify depression in other languages. The proposed method is based on a word-level alignment process. In particular, we propose two representations for the alignment: one takes advantage of the psycholinguistic resource LIWC, and the other uses bilingual word embeddings. To evaluate the approach, we addressed depression detection using English tweets as the source data and Spanish tweets as the target data. The results outperformed solutions based on automatic translation of the texts, confirming the usefulness of the proposed approach.
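
The abstract does not spell out how the word-level alignment is computed. As a rough illustration only, the sketch below shows one way such an alignment could work with bilingual word embeddings: each target-language (Spanish) word is mapped to its nearest source-language (English) word by cosine similarity in a shared vector space, so that a classifier trained on English features could be applied to the aligned tokens. The nearest-neighbor rule, the function names `align_word` and `align_tokens`, and the toy two-dimensional vectors are all assumptions made for this example; they are not taken from the paper.

```python
import math

# Toy 2-d vectors in a shared bilingual space (invented for illustration;
# real bilingual embeddings would have hundreds of dimensions).
EN_VECTORS = {
    "sad": (0.9, 0.1),
    "happy": (0.1, 0.9),
    "tired": (0.7, 0.3),
}
ES_VECTORS = {
    "triste": (0.88, 0.12),
    "feliz": (0.12, 0.88),
    "cansado": (0.72, 0.28),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def align_word(target_word, target_vectors, source_vectors):
    """Return the source word nearest to target_word, or None if it has no vector."""
    vec = target_vectors.get(target_word)
    if vec is None:
        return None
    return max(source_vectors, key=lambda w: cosine(vec, source_vectors[w]))


def align_tokens(tokens, target_vectors, source_vectors):
    """Map target-language tokens to source-language words, dropping unknown tokens."""
    aligned = (align_word(t, target_vectors, source_vectors) for t in tokens)
    return [w for w in aligned if w is not None]


if __name__ == "__main__":
    print(align_tokens(["triste", "y", "cansado"], ES_VECTORS, EN_VECTORS))
```

In this sketch, tokens without a vector (here the stop word "y") are simply dropped, which loosely mirrors the note that stop words are ignored; an actual system would need a deliberate policy for out-of-vocabulary words.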

Notes

1. In the experiments, the seed words were translated using Google Translate.
2. We are interested in using content features; therefore, stop words are ignored.
3. https://ccc.inaoep.mx/~mmontesg/resources/CrossLingualDepression.zip
4. English data was taken from [20] and Spanish data was taken from [3].
5. An offensive term in Spanish.

References

1. Abdalla, M., Hirst, G.: Cross-lingual sentiment analysis without (good) translation. arXiv preprint arXiv:1707.01626 (2017)
2. Al-Shabi, A., Adel, A., Omar, N., Al-Moslmi, T.: Cross-lingual sentiment classification from English to Arabic using machine translation. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(12), 434–440 (2017)
3. Álvarez-Carmona, M.Á.: Author profiling in social media with multimodal information. Ph.D. thesis, Instituto Nacional de Astrofísica, Óptica y Electrónica (2019)
4. Artetxe, M., Labaka, G., Agirre, E.: Learning bilingual word embeddings with (almost) no bilingual data. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, pp. 451–462 (2017)
5. Coppersmith, G., Dredze, M., Harman, C.: Quantifying mental health signals in Twitter. In: Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pp. 51–60 (2014)
6. Coppersmith, G., Dredze, M., Harman, C., Hollingshead, K., Mitchell, M.: CLPsych 2015 shared task: depression and PTSD on Twitter. In: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pp. 31–39 (2015)
7. De Choudhury, M., Gamon, M., Counts, S., Horvitz, E.: Predicting depression via social media. In: Seventh International AAAI Conference on Weblogs and Social Media (2013)
8. Gliozzo, A., Strapparava, C.: Exploiting comparable corpora and bilingual dictionaries for cross-language text categorization. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp. 553–560. Association for Computational Linguistics (2006)
9. Guntuku, S.C., Yaden, D.B., Kern, M.L., Ungar, L.H., Eichstaedt, J.C.: Detecting depression and mental illness on social media: an integrative review. Curr. Opin. Behav. Sci. 18, 43–49 (2017)
10. Losada, D.E., Crestani, F., Parapar, J.: eRISK 2017: CLEF lab on early risk prediction on the internet: experimental foundations. In: Jones, G.J.F., et al. (eds.) CLEF 2017. LNCS, vol. 10456, pp. 346–360. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65813-1_30
11. Losada, D.E., Crestani, F., Parapar, J.: Overview of eRisk: early risk prediction on the internet. In: Bellot, P., et al. (eds.) CLEF 2018. LNCS, vol. 11018, pp. 343–361. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98932-7_30
12. Mikolov, T., Le, Q.V., Sutskever, I.: Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168 (2013)
13. Nadeem, M.: Identifying depression on Twitter. arXiv preprint arXiv:1607.07384 (2016)
14. Pedersen, T.: Screening Twitter users for depression and PTSD with lexical decision lists. In: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pp. 46–53 (2015)
15. Pennebaker, J.W., Booth, R.J., Francis, M.E.: LIWC2007: linguistic inquiry and word count. LIWC.net, Austin (2007)
16. Prettenhofer, P., Stein, B.: Cross-language text classification using structural correspondence learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1118–1127 (2010)
17. Reece, A.G., Reagan, A.J., Lix, K.L., Dodds, P.S., Danforth, C.M., Langer, E.J.: Forecasting the onset and course of mental illness with Twitter data. Sci. Rep. 7(1), 13006 (2017)
18. Ruder, S.: A survey of cross-lingual word embedding models. CoRR abs/1706.04902 (2017)
19. Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. (CSUR) 34(1), 1–47 (2002)
20. Shen, G., et al.: Depression detection via harvesting social media: a multimodal dictionary learning solution. In: IJCAI, pp. 3838–3844 (2017)
21. Stankevich, M., Isakov, V., Devyatkin, D., Smirnov, I.: Feature engineering for depression detection in social media. In: ICPRAM, pp. 426–431 (2018)
22. Tsugawa, S., Kikuchi, Y., Kishino, F., Nakajima, K., Itoh, Y., Ohsaki, H.: Recognizing depression from Twitter activity. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3187–3196 (2015)
23. Wei, B., Pal, C.: Cross lingual adaptation: an experiment on sentiment classifications. In: Proceedings of the ACL 2010 Conference Short Papers, pp. 258–262. Association for Computational Linguistics (2010)
24. Wolohan, J., Hiraga, M., Mukherjee, A., Sayyed, Z.A., Millard, M.: Detecting linguistic traces of depression in topic-restricted text: attending to self-stigmatized depression with NLP. In: Proceedings of the First International Workshop on Language Cognition and Computational Models, pp. 11–21 (2018)
25. Yang, X., McCreadie, R., Macdonald, C., Ounis, I.: Transfer learning for multi-language Twitter election classification. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 341–348 (2017)

Acknowledgments

This work was partially supported by CONACYT under scholarship 869498, postdoctoral fellowship CVU-174410 and project grant CB-2015-01-257383.

Author information

Authors and Affiliations

Instituto Nacional de Astrofísica, Óptica y Electrónica, Santa María Tonantzintla, Puebla, Mexico
Laritza Coello-Guilarte, Rosa María Ortega-Mendoza, Luis Villaseñor-Pineda & Manuel Montes-y-Gómez

Corresponding author

Correspondence to Laritza Coello-Guilarte.

Editor information

Editors and Affiliations

Universita della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani
Zurich University of Applied Sciences, Winterthur, Switzerland
Martin Braschler
University of Neuchâtel, Neuchâtel, Switzerland
Jacques Savoy
Technische Universität Wien, Vienna, Austria
Andreas Rauber
HES-SO Valais-Wallis, Sierre, Switzerland
Henning Müller
University of Santiago de Compostela, Santiago de Compostela, Spain
David E. Losada
Swiss Alliance for Data-Intensive Services, Thun, Switzerland
Gundula Heinatz Bürki
University of Padua, Padua, Italy
Linda Cappellato
University of Padua, Padua, Italy
Nicola Ferro

About this paper

Cite this paper

Coello-Guilarte, L., Ortega-Mendoza, R.M., Villaseñor-Pineda, L., Montes-y-Gómez, M. (2019). Crosslingual Depression Detection in Twitter Using Bilingual Word Alignments. In: Crestani, F., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2019. Lecture Notes in Computer Science, vol 11696. Springer, Cham. https://doi.org/10.1007/978-3-030-28577-7_2

DOI: https://doi.org/10.1007/978-3-030-28577-7_2
Published: 03 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28576-0
Online ISBN: 978-3-030-28577-7
eBook Packages: Computer Science, Computer Science (R0)