Crosslingual Depression Detection in Twitter Using Bilingual Word Alignments

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2019)

Abstract

Depression is a mental disorder with strong social and economic implications. Given its relevance, several recent studies have analyzed social media content to identify and track depressed users. Most approaches follow a supervised learning strategy that depends on the availability of labeled training data. Unfortunately, acquiring such data is complex and costly. To address this problem, we propose a crosslingual approach based on the idea that data already labeled in one language can be leveraged to detect depression in other languages. The proposed method relies on a word-level alignment process. In particular, we propose two representations for the alignment: one takes advantage of the psycholinguistic resource LIWC and the other uses bilingual word embeddings. To evaluate the approach, we addressed depression detection using English tweets as the source data and Spanish tweets as the target data. The results outperformed solutions based on automatic translation of the texts, confirming the usefulness of the proposed approach.
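The abstract does not spell out how the word-level alignment is computed, so the following is only a minimal sketch of one plausible reading of the embedding-based variant, not the authors' implementation. It assumes that both languages already share one embedding space and that each target-language word is aligned to its nearest source-language word by cosine similarity. The function names (`cosine`, `align_words`, `to_source_bow`) and the three-dimensional toy vectors are hypothetical and chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_words(target_vecs, source_vecs):
    """Map each target-language word to its most similar source-language word."""
    alignment = {}
    for t_word, t_vec in target_vecs.items():
        best = max(source_vecs, key=lambda s: cosine(t_vec, source_vecs[s]))
        alignment[t_word] = best
    return alignment


def to_source_bow(tokens, alignment):
    """Count aligned source-language words; tokens without an alignment are skipped."""
    counts = {}
    for tok in tokens:
        src = alignment.get(tok)
        if src is not None:
            counts[src] = counts.get(src, 0) + 1
    return counts


# Toy vectors (assumption: a shared bilingual space; values are made up).
EN = {"sad": [0.9, 0.1, 0.0], "happy": [0.1, 0.9, 0.0], "sleep": [0.0, 0.1, 0.9]}
ES = {"triste": [0.8, 0.2, 0.1], "feliz": [0.2, 0.8, 0.0], "dormir": [0.1, 0.0, 0.9]}

ALIGNMENT = align_words(ES, EN)
# "hoy" has no vector in ES, so it contributes nothing to the features.
FEATURES = to_source_bow(["triste", "triste", "dormir", "hoy"], ALIGNMENT)
print(ALIGNMENT)
print(FEATURES)
```

Under these assumptions, a Spanish tweet is expressed over the English vocabulary, so a classifier trained on labeled English tweets could in principle be applied to it without translating the full text.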

Notes

  1. In the experiments, the translation of the seed words was done by means of Google Translator.

  2. We are interested in using content features, therefore the stop words are ignored.

  3. https://ccc.inaoep.mx/~mmontesg/resources/CrossLingualDepression.zip.

  4. English data was taken from [20] and Spanish data was taken from [3].

  5. An offensive term in Spanish.

References

  1. Abdalla, M., Hirst, G.: Cross-lingual sentiment analysis without (good) translation. arXiv preprint arXiv:1707.01626 (2017)

  2. Al-Shabi, A., Adel, A., Omar, N., Al-Moslmi, T.: Cross-lingual sentiment classification from English to Arabic using machine translation. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(12), 434–440 (2017)

  3. Álvarez-Carmona, M.Á.: Author profiling in social media with multimodal information. Ph.D. thesis, Instituto Nacional de Astrofísica, Óptica y Electrónica (2019)

  4. Artetxe, M., Labaka, G., Agirre, E.: Learning bilingual word embeddings with (almost) no bilingual data. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, pp. 451–462 (2017)

  5. Coppersmith, G., Dredze, M., Harman, C.: Quantifying mental health signals in Twitter. In: Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pp. 51–60 (2014)

  6. Coppersmith, G., Dredze, M., Harman, C., Hollingshead, K., Mitchell, M.: CLPsych 2015 shared task: depression and PTSD on Twitter. In: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pp. 31–39 (2015)

  7. De Choudhury, M., Gamon, M., Counts, S., Horvitz, E.: Predicting depression via social media. In: Seventh International AAAI Conference on Weblogs and Social Media (2013)

  8. Gliozzo, A., Strapparava, C.: Exploiting comparable corpora and bilingual dictionaries for cross-language text categorization. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp. 553–560. Association for Computational Linguistics (2006)

  9. Guntuku, S.C., Yaden, D.B., Kern, M.L., Ungar, L.H., Eichstaedt, J.C.: Detecting depression and mental illness on social media: an integrative review. Curr. Opin. Behav. Sci. 18, 43–49 (2017)

  10. Losada, D.E., Crestani, F., Parapar, J.: eRISK 2017: CLEF lab on early risk prediction on the internet: experimental foundations. In: Jones, G.J.F., et al. (eds.) CLEF 2017. LNCS, vol. 10456, pp. 346–360. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65813-1_30

  11. Losada, D.E., Crestani, F., Parapar, J.: Overview of eRisk: early risk prediction on the internet. In: Bellot, P., et al. (eds.) CLEF 2018. LNCS, vol. 11018, pp. 343–361. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98932-7_30

  12. Mikolov, T., Le, Q.V., Sutskever, I.: Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168 (2013)

  13. Nadeem, M.: Identifying depression on Twitter. arXiv preprint arXiv:1607.07384 (2016)

  14. Pedersen, T.: Screening Twitter users for depression and PTSD with lexical decision lists. In: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, pp. 46–53 (2015)

  15. Pennebaker, J.W., Booth, R.J., Francis, M.E.: LIWC2007: linguistic inquiry and word count. LIWC.net, Austin (2007)

  16. Prettenhofer, P., Stein, B.: Cross-language text classification using structural correspondence learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1118–1127 (2010)

  17. Reece, A.G., Reagan, A.J., Lix, K.L., Dodds, P.S., Danforth, C.M., Langer, E.J.: Forecasting the onset and course of mental illness with Twitter data. Sci. Rep. 7(1), 13006 (2017)

  18. Ruder, S.: A survey of cross-lingual word embedding models. CoRR abs/1706.04902 (2017)

  19. Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. (CSUR) 34(1), 1–47 (2002)

  20. Shen, G., et al.: Depression detection via harvesting social media: a multimodal dictionary learning solution. In: IJCAI, pp. 3838–3844 (2017)

  21. Stankevich, M., Isakov, V., Devyatkin, D., Smirnov, I.: Feature engineering for depression detection in social media. In: ICPRAM, pp. 426–431 (2018)

  22. Tsugawa, S., Kikuchi, Y., Kishino, F., Nakajima, K., Itoh, Y., Ohsaki, H.: Recognizing depression from Twitter activity. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3187–3196 (2015)

  23. Wei, B., Pal, C.: Cross lingual adaptation: an experiment on sentiment classifications. In: Proceedings of the ACL 2010 Conference Short Papers, pp. 258–262. Association for Computational Linguistics (2010)

  24. Wolohan, J., Hiraga, M., Mukherjee, A., Sayyed, Z.A., Millard, M.: Detecting linguistic traces of depression in topic-restricted text: attending to self-stigmatized depression with NLP. In: Proceedings of the First International Workshop on Language Cognition and Computational Models, pp. 11–21 (2018)

  25. Yang, X., McCreadie, R., Macdonald, C., Ounis, I.: Transfer learning for multi-language Twitter election classification. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 341–348 (2017)

Acknowledgments

This work was partially supported by CONACYT under scholarship 869498, postdoctoral fellowship CVU-174410 and project grant CB-2015-01-257383.

Author information

Corresponding author

Correspondence to Laritza Coello-Guilarte.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Coello-Guilarte, L., Ortega-Mendoza, R.M., Villaseñor-Pineda, L., Montes-y-Gómez, M. (2019). Crosslingual Depression Detection in Twitter Using Bilingual Word Alignments. In: Crestani, F., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2019. Lecture Notes in Computer Science, vol 11696. Springer, Cham. https://doi.org/10.1007/978-3-030-28577-7_2

  • DOI: https://doi.org/10.1007/978-3-030-28577-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28576-0

  • Online ISBN: 978-3-030-28577-7

  • eBook Packages: Computer Science (R0)
