Abstract
Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Naïve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (Probability Factor of Affective use, PFA).
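To make the kind of setup summarized in the abstract concrete, the following is a minimal sketch of n-gram features feeding a multinomial Naïve Bayes classifier. It is not the authors' implementation. The preprocessing (lowercasing, replacing URLs and user mentions with placeholders), the Laplace smoothing, and all names (`preprocess`, `ngrams`, `NaiveBayes`) are assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(tweet):
    """Illustrative normalization (an assumption, not the paper's pipeline):
    lowercase, replace URLs and @mentions with placeholders, then tokenize."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", " URL ", tweet)
    tweet = re.sub(r"@\w+", " USER ", tweet)
    return re.findall(r"\w+", tweet)


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with Laplace smoothing."""

    def __init__(self, n=1, alpha=1.0):
        self.n = n          # n-gram size (one of the settings the paper varies)
        self.alpha = alpha  # smoothing constant (hypothetical default)
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            feats = ngrams(preprocess(text), self.n)
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        feats = [f for f in ngrams(preprocess(text), self.n) if f in self.vocab]
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods of seen n-grams
            score = math.log(n_docs / total_docs)
            total = sum(self.feature_counts[label].values())
            for f in feats:
                count = self.feature_counts[label][f]
                score += math.log((count + self.alpha) / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

With a labeled list of tweets, `NaiveBayes(n=1).fit(texts, labels)` trains a unigram model. Changing `n` gives a simple way to compare n-gram sizes on the same data, which is one of the settings the abstract mentions.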
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sidorov, G. et al. (2013). Empirical Study of Machine Learning Based Approach for Opinion Mining in Tweets. In: Batyrshin, I., González Mendoza, M. (eds) Advances in Artificial Intelligence. MICAI 2012. Lecture Notes in Computer Science, vol 7629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37807-2_1
DOI: https://doi.org/10.1007/978-3-642-37807-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37806-5
Online ISBN: 978-3-642-37807-2
eBook Packages: Computer Science, Computer Science (R0)