Dictionary-Based Sentiment Analysis Applied to a Specific Domain

Part of the Communications in Computer and Information Science book series (CCIS, volume 656)

Abstract

The web and social media have grown exponentially in recent years, giving access to documents that express opinions on a broad range of topics. This constitutes a rich resource for natural language processing tasks, particularly sentiment analysis. Sentiment analysis nevertheless remains difficult because expressed sentiments are usually topic-oriented: the polarity of a term often depends on the domain in which it is used. In this paper, we propose to automatically construct a sentiment dictionary from relevant terms obtained from web pages for a specific domain. The dictionary is initially built by querying the web with combinations of opinion terms and domain terms. To select only relevant terms, we apply two measures, \(\textit{AcroDef}_{\textit{MI}3}\) and TrueSkill. Experiments conducted on different domains show that our automatic approach performs better in specific cases.
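
To make the term-selection idea concrete, the following is a minimal sketch of how a candidate term could be scored against positive and negative opinion seeds using web hit counts. It is not the paper's exact \(\textit{AcroDef}_{\textit{MI}3}\) formula, which is not given in this excerpt; it assumes the standard cubic mutual information, \(\log_2\big(nb(x,y)^3 / (nb(x)\,nb(y))\big)\), and the function names `cubic_mi` and `orientation` as well as all counts are hypothetical.

```python
import math


def cubic_mi(joint_hits: int, term_hits: int, seed_hits: int) -> float:
    """Cubic mutual information (MI3) computed from page counts.

    joint_hits: pages containing both the candidate term and the seed
    term_hits:  pages containing the candidate term
    seed_hits:  pages containing the seed
    Returns -inf when any count is zero (no evidence of association).
    """
    if min(joint_hits, term_hits, seed_hits) <= 0:
        return float("-inf")
    return math.log2(joint_hits ** 3 / (term_hits * seed_hits))


def orientation(term_hits: int, pos: tuple, neg: tuple) -> str:
    """Label a candidate term by comparing its MI3 with two seeds.

    pos and neg are (joint_hits, seed_hits) pairs for a positive and a
    negative opinion seed queried together with the domain terms.
    """
    pos_score = cubic_mi(pos[0], term_hits, pos[1])
    neg_score = cubic_mi(neg[0], term_hits, neg[1])
    if pos_score == neg_score:
        return "neutral"
    return "positive" if pos_score > neg_score else "negative"


# Hypothetical counts, for illustration only.
label = orientation(term_hits=16, pos=(8, 4), neg=(2, 4))
```

In practice the counts would come from search-engine queries that combine the candidate term, an opinion seed, and the domain terms; the TrueSkill step mentioned in the abstract is not covered by this sketch.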

Keywords

  • Text mining
  • Web mining
  • Sentiment analysis

Notes

  1. In this paper, we use "term" to characterize linguistic features.

  2. http://sentiwordnet.isti.cnr.it

  3. For simplicity, we only report experiments conducted on nouns and adjectives in this paper. Further experiments were carried out with adverbs and verbs.

  4. http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/

  5. http://www.cs.cornell.edu/People/pabo/movie-review-data/

  6. https://www.cs.jhu.edu/~mdredze/datasets/sentiment/

Acknowledgement

This work has been supported and funded by FONDECYT and the SONGES project (http://textmining.biz/Projects/Songes) (FEDER and Occitanie).

Author information

Corresponding authors

Correspondence to Laura Cruz, José Ochoa, Mathieu Roche or Pascal Poncelet.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Cruz, L., Ochoa, J., Roche, M., Poncelet, P. (2017). Dictionary-Based Sentiment Analysis Applied to a Specific Domain. In: Lossio-Ventura, J., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2015, SIMBig 2016. Communications in Computer and Information Science, vol 656. Springer, Cham. https://doi.org/10.1007/978-3-319-55209-5_5

  • DOI: https://doi.org/10.1007/978-3-319-55209-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55208-8

  • Online ISBN: 978-3-319-55209-5

  • eBook Packages: Computer Science, Computer Science (R0)