TOPIE: An Open-Source Opinion Mining Pipeline to Analyze Consumers’ Sentiment in Brazilian Portuguese

  • Ellen Souza
  • Tiago Alves
  • Ingryd Teles
  • Adriano L. I. Oliveira
  • Cristine Gusmão
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9727)


The growth of social media and user-generated content (UGC) on the Internet provides a huge quantity of information from which the experiences, opinions, and feelings of users and customers can be discovered. These electronic word-of-mouth statements are prevalent in the business and service industries, where they let customers share their points of view. However, the volume is too large for humans to read and interpret in a reasonable amount of time. Opinion mining (also known as sentiment analysis) is a sub-field of text mining whose main task is to extract opinions from UGC. This work presents an open-source pipeline to analyze customers' opinions or sentiment on Twitter about products and services offered by Brazilian companies. The pipeline is based on the General Architecture for Text Engineering (GATE) framework, and the proposed hybrid method combines lexicon-based, supervised learning, and rule-based approaches. Case studies performed on real Twitter data achieved a precision of almost 70%.
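
As a rough illustration of how lexicon-based scoring, a hand-written rule, and a learned classifier can be combined, the following Python sketch shows one possible arrangement. It is not the TOPIE implementation, which is built on GATE; the toy lexicon, the negation rule, the function names, and the order in which the components are consulted are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Optional

# Hypothetical toy lexicon for Brazilian Portuguese; NOT the lexicon used by TOPIE.
LEXICON = {"bom": 1, "ótimo": 1, "adorei": 1, "ruim": -1, "péssimo": -1, "odiei": -1}
# Hypothetical negation words used by the rule-based step.
NEGATORS = {"não", "nunca", "jamais"}


def tokenize(text: str) -> List[str]:
    """Lowercase the tweet, drop URLs and @mentions, and split into word tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"\w+", cleaned)


def lexicon_score(tokens: List[str]) -> int:
    """Sum lexicon polarities; a negator flips the next opinion word (assumed rule)."""
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        if polarity != 0:
            score += -polarity if negate else polarity
            negate = False
    return score


def classify(text: str, fallback: Optional[Callable[[str], str]] = None) -> str:
    """Use the lexicon and rule first; defer to a supervised model when they are undecided."""
    score = lexicon_score(tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # 'fallback' stands in for any trained classifier (assumption, not from the paper).
    if fallback is not None:
        return fallback(text)
    return "neutral"
```

In this arrangement the supervised component is consulted only when the lexicon and the rule give no signal; a real system could just as well weight or vote among the three components.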


Keywords: Text mining · Text classification · Opinion mining · Sentiment analysis · Portuguese language · GATE



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ellen Souza (1, 2), corresponding author
  • Tiago Alves (1)
  • Ingryd Teles (1)
  • Adriano L. I. Oliveira (2)
  • Cristine Gusmão (3)

  1. MiningBR Research Group, Federal Rural University of Pernambuco (UFRPE), Serra Talhada, Brazil
  2. Centro de Informática, Federal University of Pernambuco (CIn-UFPE), Recife, Brazil
  3. Programa de Pós-graduação em Engenharia Biomédica, Centro de Tecnologia e Geociências, Federal University of Pernambuco (CTG-UFPE), Recife, Brazil
