Evaluating Industrial and Research Sentiment Analysis Engines on Multiple Sources

Part of the Lecture Notes in Computer Science book series (LNAI, volume 10640)

Abstract

Sentiment analysis plays a fundamental role in analyzing users' opinions in all kinds of textual sources. Accurately computing the sentiment expressed in huge amounts of textual data is a key task in strong market demand, and industrial engines now offer ready-to-use APIs for sentiment analysis-related tasks. However, building sentiment engines that achieve high accuracy on structurally different textual sources (e.g., reviews, tweets, blogs) is not a trivial task. Papers on cross-source evaluation lack a comparison with industrial engines, which are instead specifically designed to deal with multiple sources.

In this paper, we compare research and industrial engines in an extensive experimental evaluation, considering the document-level polarity detection task performed on different textual sources: tweets, app reviews, and general product reviews, in both English and Italian. The results help the reader quantify the performance gap between industrial and research sentiment engines when both are tested on heterogeneous textual sources and on different languages (English/Italian). Finally, we present the results of our multi-source solution, X2Check. Considering the overall cross-source average F-score over all results, X2Check performs 9.1% and 5.1% higher than Google CNL on the Italian and English benchmarks, respectively. Compared to the research engines, X2Check's F-score is always higher than that of tools not specifically trained on the test set under evaluation; compared to the best research tools specifically trained on the target source, it is lower by at most 3.4% on the Italian benchmarks and 11.6% on the English benchmarks.
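
To make the "cross-source average F-score" concrete, the following is a minimal sketch of one way such a figure could be computed. It assumes a macro-averaged F1 over two polarity classes per source, followed by an unweighted mean across sources; the abstract does not state the exact averaging scheme, so this is an illustration rather than the paper's procedure. The function names, the label set, and the toy data are hypothetical.

```python
from statistics import mean


def f1_for_label(gold, pred, label):
    """F1 of a single class, computed from gold and predicted label lists."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=("positive", "negative")):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return mean(f1_for_label(gold, pred, label) for label in labels)


def cross_source_average(results):
    """Unweighted mean of macro-F1 over sources.

    `results` maps a source name to a (gold, pred) pair of label lists.
    """
    return mean(macro_f1(gold, pred) for gold, pred in results.values())


# Toy data, invented purely for illustration (not taken from the paper).
toy_results = {
    "tweets": (
        ["positive", "negative", "positive", "negative"],
        ["positive", "negative", "negative", "negative"],
    ),
    "app_reviews": (
        ["positive", "negative"],
        ["positive", "negative"],
    ),
}

if __name__ == "__main__":
    for source, (gold, pred) in toy_results.items():
        print(source, round(macro_f1(gold, pred), 4))
    print("cross-source average", round(cross_source_average(toy_results), 4))
```

An unweighted mean treats every source equally regardless of its size; weighting by the number of documents per source would be an equally plausible choice and would generally give a different figure.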

Keywords

  • Sentiment analysis
  • Natural language processing
  • Machine learning
  • Experimental evaluation
  • Industrial and research tools comparison
  • Cross-domain sentiment classification

Notes

  1. https://app2check.com/performance

Author information

Corresponding author

Correspondence to Emanuele Di Rosa.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Di Rosa, E., Durante, A. (2017). Evaluating Industrial and Research Sentiment Analysis Engines on Multiple Sources. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science, vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_11

  • DOI: https://doi.org/10.1007/978-3-319-70169-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70168-4

  • Online ISBN: 978-3-319-70169-1

  • eBook Packages: Computer Science; Computer Science (R0)