Abstract
The web and social media have grown exponentially in recent years, giving access to documents that express opinions on a broad range of topics. This constitutes a rich resource for natural language processing tasks, particularly sentiment analysis. Nevertheless, sentiment analysis remains difficult because expressed sentiments are usually topic-oriented. In this paper, we propose to automatically construct a sentiment dictionary using relevant terms obtained from web pages for a specific domain. This dictionary is initially built by querying the web with a combination of opinion terms and terms of the domain. In order to select only relevant terms, we apply two measures: \(\textit{AcroDef}_{\textit{MI}3}\) and TrueSkill. Experiments conducted on different domains highlight that our automatic approach performs better for specific cases.
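The abstract names \(\textit{AcroDef}_{\textit{MI}3}\) without defining it. As a rough illustration only, the sketch below scores candidate terms with the general cubic mutual information, \(\log_2\big(nb(x,y)^3 / (nb(x)\,nb(y))\big)\), computed from web page counts. It is not the exact measure used in the paper. The function name, the candidate terms, and all hit counts are hypothetical, and no real search engine is queried.

```python
import math


def cubic_mutual_information(hits_joint, hits_x, hits_y):
    """Cubic mutual information from page counts (illustrative only).

    hits_joint: pages containing both x and y
    hits_x, hits_y: pages containing x and y respectively
    """
    if hits_joint <= 0 or hits_x <= 0 or hits_y <= 0:
        # No co-occurrence evidence: treat the association as minimal.
        return float("-inf")
    return math.log2(hits_joint ** 3 / (hits_x * hits_y))


# Hypothetical counts: (pages with candidate AND seed query,
#                       pages with candidate,
#                       pages with seed query),
# where the seed query combines an opinion term with a domain term.
candidates = {
    "moving": (400, 5000, 2000),
    "long": (50, 9000, 2000),
}

scores = {
    term: cubic_mutual_information(*counts)
    for term, counts in candidates.items()
}

# Candidates most strongly associated with the seed query come first.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Cubing the joint count favors candidates that co-occur frequently with the seed query, which reduces the bias of plain mutual information toward rare terms.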
Notes
- 1. In this paper, we use the word "term" to characterize linguistic features.
- 2.
- 3. For simplicity, this paper reports only the experiments conducted on nouns and adjectives. Other experiments were carried out with adverbs and verbs.
- 4.
- 5.
- 6.
Acknowledgement
This work has been supported and funded by FONDECYT and the SONGES project (http://textmining.biz/Projects/Songes) (FEDER and Occitanie).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Cruz, L., Ochoa, J., Roche, M., Poncelet, P. (2017). Dictionary-Based Sentiment Analysis Applied to a Specific Domain. In: Lossio-Ventura, J., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2015, SIMBig 2016. Communications in Computer and Information Science, vol 656. Springer, Cham. https://doi.org/10.1007/978-3-319-55209-5_5
DOI: https://doi.org/10.1007/978-3-319-55209-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55208-8
Online ISBN: 978-3-319-55209-5
eBook Packages: Computer Science, Computer Science (R0)