
A Comparison Between Two Spanish Sentiment Lexicons in the Twitter Sentiment Analysis Task

Part of the Lecture Notes in Computer Science book series (LNAI, volume 10022)

Abstract

Sentiment analysis aims to determine people’s opinions towards certain entities (e.g., products, movies, people). In this paper we describe experiments performed to determine the sentiment polarity of tweets in the Spanish corpus used in the TASS workshop. We explore two Spanish sentiment lexicons to assess the effect of these resources on the Twitter sentiment analysis task. Rule-based and supervised classification methods were implemented, and several variations of those approaches were tested. The results show that the information from both lexicons improves accuracy when it is provided as a feature to a Naïve Bayes classifier. Despite the simplicity of the proposed strategy, the supervised approach obtained better results than several participant teams of the TASS workshop, and even the rule-based approach surpassed the accuracy of one team that used a supervised algorithm.
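The two strategies named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the label names (`P`, `N`, `NEU`), the whitespace tokenizer, and the choice to pass lexicon hits to the classifier as pseudo-tokens. It is not the authors' implementation, and it does not reproduce the lexicons or the preprocessing used in the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the lexicons compared in the paper are not reproduced here.
TOY_LEXICON = {
    "bueno": "positive", "excelente": "positive", "feliz": "positive",
    "malo": "negative", "terrible": "negative", "triste": "negative",
}

def lexicon_counts(tokens, lexicon=TOY_LEXICON):
    """Count how many tokens fall in each lexicon polarity class."""
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    return counts["positive"], counts["negative"]

def rule_based_polarity(text, lexicon=TOY_LEXICON):
    """Rule-based baseline: compare positive and negative lexicon hits."""
    pos, neg = lexicon_counts(text.lower().split(), lexicon)
    if pos > neg:
        return "P"
    if neg > pos:
        return "N"
    return "NEU"

def featurize(text, lexicon=TOY_LEXICON):
    """Bag of words plus pseudo-tokens that carry the lexicon counts."""
    tokens = text.lower().split()
    pos, neg = lexicon_counts(tokens, lexicon)
    return tokens + ["__LEX_POS__"] * pos + ["__LEX_NEG__"] * neg

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed variant)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = featurize(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in featurize(text):
                if feat in self.vocab:  # features unseen in training are ignored
                    score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Tiny made-up training set, for illustration only.
train_texts = ["la película es excelente", "qué día tan feliz",
               "el servicio es terrible", "estoy muy triste hoy"]
train_labels = ["P", "P", "N", "N"]
model = NaiveBayes().fit(train_texts, train_labels)
```

In this sketch, `model.predict("un resultado bueno")` can only rely on the lexicon pseudo-token, because none of the words in that sentence occur in the toy training set. That is one way lexicon information may help a supervised classifier generalize to vocabulary it has not seen.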

Keywords

  • Sentiment Analysis
  • Training Corpus
  • Rule Base Approach
  • Supervise Approach
  • Negative Sentiment

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Notes

  1. All the words in the SEL lexicon are lemmatized.

  2. Experimentally, better results were achieved when using all the categories instead of only the affective process category, as in the rule-based approach.


Acknowledgments

We gratefully acknowledge the support of Instituto Politécnico Nacional (IPN), ESCOM-IPN, CIC-IPN, SIP-IPN projects number 20160815 and 20162058, COFAA-IPN, and EDI-IPN.

Author information

Corresponding author

Correspondence to Omar Juárez Gambino.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Gambino, O.J., Calvo, H. (2016). A Comparison Between Two Spanish Sentiment Lexicons in the Twitter Sentiment Analysis Task. In: Montes y Gómez, M., Escalante, H., Segura, A., Murillo, J. (eds) Advances in Artificial Intelligence - IBERAMIA 2016. IBERAMIA 2016. Lecture Notes in Computer Science, vol 10022. Springer, Cham. https://doi.org/10.1007/978-3-319-47955-2_11

  • DOI: https://doi.org/10.1007/978-3-319-47955-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47954-5

  • Online ISBN: 978-3-319-47955-2

  • eBook Packages: Computer Science (R0)