Comparing Approaches to Subjectivity Classification: A Study on Portuguese Tweets

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2016)

Abstract

In this paper, we compare lexicon-based and machine learning-based approaches to determining the subjectivity of tweets in Portuguese. We tested the SentiLex and WordnetAffectBR lexicons and the Sequential Minimal Optimization and Naive Bayes algorithms for this task. In our study, we used the Computer-BR corpus, which contains messages about the technology domain. We obtained better results using the Comprehensive Measurement Feature Selection method with the Sequential Minimal Optimization algorithm as the classifier. We achieved considerable accuracy when we included the polarities of words in the vector space model of the tweets.
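The two families of approaches compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon, the decision rule (a tweet is subjective if it contains any lexicon word), and the two polarity-count features appended to the bag-of-words vector are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy polarity lexicon for illustration only; these entries are
# NOT taken from SentiLex or WordnetAffectBR. Maps word -> polarity (+1 / -1).
TOY_LEXICON = {
    "ótimo": 1, "bom": 1, "adoro": 1,
    "péssimo": -1, "ruim": -1, "odeio": -1,
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Lexicon-based rule (assumed): subjective if any token is in the lexicon."""
    has_polar_word = any(tok in lexicon for tok in tokenize(text))
    return "subjective" if has_polar_word else "objective"


def featurize(text, vocabulary, lexicon=TOY_LEXICON):
    """Bag-of-words counts over `vocabulary` plus positive/negative word counts."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    bow = [counts[word] for word in vocabulary]
    positive = sum(1 for tok in tokens if lexicon.get(tok, 0) > 0)
    negative = sum(1 for tok in tokens if lexicon.get(tok, 0) < 0)
    return bow + [positive, negative]


tweets = [
    "Adoro meu notebook novo, ótimo!",
    "O notebook tem 8 GB de memória.",
]
vocabulary = sorted({tok for tweet in tweets for tok in tokenize(tweet)})
labels = [lexicon_label(tweet) for tweet in tweets]
vectors = [featurize(tweet, vocabulary) for tweet in tweets]
```

Vectors built this way could then be passed to any supervised classifier, such as Naive Bayes or a support vector machine trained with Sequential Minimal Optimization; how the paper actually encodes word polarities in its vector space model is not stated in the abstract.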

Notes

  1. http://beta.visl.sdu.dk/visl/pt/.

  2. This heuristic is a small variation of the strategy proposed in [9].

  3. http://www.cs.waikato.ac.nz/ml/weka/.

References

  1. Feldman, R.: Techniques and applications for sentiment analysis. Commun. ACM 56(4), 82–89 (2013)

  2. Dale, R., Moisl, H., Somers, H. (eds.): Handbook of Natural Language Processing. CRC Press, Boca Raton (2000)

  3. Kamal, A.: Subjectivity Classification using Machine Learning Techniques for Mining Feature-Opinion Pairs from Web Opinion Sources (2013). arXiv preprint arXiv:1312.6962

  4. Fersini, E., Messina, E., Pozzi, F.A.: Subjectivity, polarity and irony detection: a multi-layer approach. In: Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 & the Fourth International Workshop EVALITA (2014)

  5. Drury, B., de Andrade Lopes, A.: A comparison of the effect of feature selection and balancing strategies upon the sentiment classification of Portuguese news stories. In: Proceedings of ENIAC (2014)

  6. Santos, A.P., Ramos, C., Marques, N.C.: Sentiment classification of Portuguese news headlines. Int. J. Softw. Eng. Appl. 9(9), 9–18 (2015)

  7. Rosa, R.L., Rodríguez, D.Z., Bressan, G.: SentiMeter-Br: a social web analysis tool to discover consumers’ sentiment. In: 2013 IEEE 14th International Conference on Mobile Data Management (MDM), vol. 2, pp. 122–124. IEEE (2013)

  8. Morgado, I.C.: Classification of sentiment polarity of Portuguese on-line news. In: Proceedings of the 7th Doctoral Symposium in Informatics Engineering, pp. 139–150 (2012)

  9. Filho, P.P.B., Pardo, T.A., Aluísio, S.M.: An evaluation of the Brazilian Portuguese LIWC dictionary for sentiment analysis. In: 9th Brazilian Symposium in Information and Human Language Technology, Fortaleza, Ceará (2013)

  10. Medhat, W., Hassan, A., Korashy, H.: Sentiment analysis algorithms and applications: a survey. Ain Shams Eng. J. 5(4), 1093–1113 (2014)

  11. Carvalho, P., Silva, M.J.: SentiLex-PT: principais características e potencialidades. Oslo Stud. Lang. 7(1), 425–438 (2015)

  12. Pasqualotti, P.R., Vieira, R.: WordnetAffectBR: uma base lexical de palavras de emoções para a língua Portuguesa. RENOTE 6, 1–10 (2008)

  13. Généreux, M., Martinez, W.: Contrasting objective and subjective Portuguese texts from heterogeneous sources. In: Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pp. 46–51. Association for Computational Linguistics (2012)

  14. Moraes, S., Silveira, M., Manssour, I.: 7x1-PT: um Corpus extraído do Twitter para Análise de Sentimentos em Língua Portuguesa. BRACIS, STIL (2015)

  15. Yang, J., Liu, Y., Zhu, X., Liu, Z., Zhang, X.: A new feature selection based on comprehensive measurement both in inter-category and intra-category for text categorization. Inf. Process. Manage. 48(4), 741–754 (2012)

  16. Yang, J., Qu, Z., Liu, Z.: Improved feature-selection method considering the imbalance problem in text categorization. Sci. World J. (2014)

  17. Souza, M., Vieira, R.: Sentiment analysis on twitter data for Portuguese language. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds.) PROPOR 2012. LNCS, vol. 7243, pp. 241–247. Springer, Heidelberg (2012)

  18. Lambov, D., Dias, G., Noncheva, V.: High-level features for learning subjective language across domains. In: Proceedings of International AAAI Conference on Weblogs and Social Media ICWSM (2009)

Acknowledgments

Our thanks to Dell for the financial support of this work.

Author information

Corresponding author

Correspondence to André L. L. Santos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Moraes, S.M.W., Santos, A.L.L., Redecker, M., Machado, R.M., Meneguzzi, F.R. (2016). Comparing Approaches to Subjectivity Classification: A Study on Portuguese Tweets. In: Silva, J., Ribeiro, R., Quaresma, P., Adami, A., Branco, A. (eds) Computational Processing of the Portuguese Language. PROPOR 2016. Lecture Notes in Computer Science, vol 9727. Springer, Cham. https://doi.org/10.1007/978-3-319-41552-9_8

  • DOI: https://doi.org/10.1007/978-3-319-41552-9_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41551-2

  • Online ISBN: 978-3-319-41552-9

  • eBook Packages: Computer Science (R0)
