Word Embeddings and Deep Learning for Spanish Twitter Sentiment Analysis

Part of the Communications in Computer and Information Science book series (CCIS, volume 898)

Abstract

Spanish is the third most-used language on the internet. However, Natural Language Processing research in Spanish still lags far behind that in other languages such as English. The aim of this paper is to fill this gap in the literature and to provide a comprehensive assessment of Deep Learning applied to Spanish sentiment analysis. We focus on the polarity detection task, which remains challenging in the context of Spanish Twitter messages. To do so, we explore combinations of several word representations (Word2Vec, GloVe, FastText) and Deep Neural Network models. Unlike previous related work, which reported poor performance when using Deep Learning for Spanish sentiment analysis, we show promising results. Our best setting combines three word embedding representations with Convolutional Neural Networks and Recurrent Neural Networks. This setup allows us to obtain state-of-the-art results on the TASS/SEPLN 2017 Spanish Twitter benchmark dataset in terms of accuracy and macro F1-measure.
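The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes one common definition of macro F1 (the unweighted mean of per-class F1 scores), which may differ from the exact formula used by the TASS organizers. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def accuracy_and_macro_f1(gold, pred):
    """Return (accuracy, macro F1) for two equal-length lists of class labels.

    Macro F1 is computed here as the unweighted mean of per-class F1 scores;
    this is an assumption, since the benchmark's exact formula is not given
    in the abstract.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")

    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    f1_scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(gold)
    macro_f1 = sum(f1_scores) / len(labels)
    return accuracy, macro_f1


# Toy usage with hypothetical polarity labels (P = positive, N = negative).
gold_labels = ["P", "P", "N", "N"]
predicted_labels = ["P", "N", "N", "N"]
acc, macro_f1 = accuracy_and_macro_f1(gold_labels, predicted_labels)
print(f"accuracy={acc:.3f} macro_f1={macro_f1:.3f}")
```

Because every class contributes equally to the mean, macro F1 penalizes a classifier that ignores infrequent polarity classes, which accuracy alone would not reveal.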

Keywords

  • Spanish sentiment analysis
  • Deep learning
  • Word embeddings

  • DOI: 10.1007/978-3-030-11680-4_4
  • Chapter length: 13 pages
Notes

  1. The following tool was used to perform POS tagging: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/.

  2. https://www.tensorflow.org/.

Author information

Corresponding author

Correspondence to José Ochoa-Luna.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ochoa-Luna, J., Ari, D. (2019). Word Embeddings and Deep Learning for Spanish Twitter Sentiment Analysis. In: Lossio-Ventura, J., Muñante, D., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2018. Communications in Computer and Information Science, vol 898. Springer, Cham. https://doi.org/10.1007/978-3-030-11680-4_4

  • DOI: https://doi.org/10.1007/978-3-030-11680-4_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11679-8

  • Online ISBN: 978-3-030-11680-4

  • eBook Packages: Computer Science (R0)