Abstract
Sentiment analysis deals with the computational detection and extraction of opinions, beliefs and emotions in written text. It combines theories and methodologies from a diverse set of scientific domains, such as psychology, natural language processing and machine learning. It fulfils the important role of transforming the unstructured textual communication between social media users into quantifiable, informed estimates of expressed sentiment, which physicists, sociologists and complex systems experts can subsequently use to study the collective properties of such phenomena. The problem has been addressed from two different but often complementary directions: lexicon-based solutions, which rely on sentiment dictionaries (i.e., lists of words in which each token is annotated with an indication of the affective content it typically conveys), and machine learning solutions, which automatically or semi-automatically learn to detect the affective content of text. In this chapter, we discuss a range of solutions and their strengths and weaknesses in different environments and settings. We conclude that, depending on the application environment and the desired output, different types of analysis are appropriate, with varying levels of predictive accuracy.
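To make the lexicon-based direction concrete, the following is a minimal sketch and not the chapter's own method. It assumes a tiny hypothetical dictionary (`TOY_LEXICON`) with made-up valence scores and a naive regular-expression tokenizer; the function names `lexicon_score` and `classify` are likewise illustrative. Real sentiment dictionaries are far larger, and practical systems usually also handle negation, intensifiers and similar phenomena, which this sketch ignores.

```python
import re

# Hypothetical toy lexicon: token -> valence (positive > 0, negative < 0).
# The entries and scores are illustrative assumptions, not a real resource.
TOY_LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "awful": -2.0, "hate": -2.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every token that appears in the lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def classify(text, lexicon=TOY_LEXICON):
    """Map the summed valence to a coarse polarity label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ("I love this great phone",
                     "awful, I hate it",
                     "the train arrives at noon"):
        print(sentence, "->", classify(sentence))
```

A machine learning solution would instead learn the token weights from annotated examples rather than reading them from a fixed dictionary.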
Notes
- 1.
- 2.
Most information retrieval systems and algorithms explicitly use this information in estimating the relevancy of a document in regard to a user query (Robertson et al. 2004).
- 3.
Valence and arousal are discussed in more depth in Sect. 6.2.2.
- 4.
In contrast, ‘unsupervised’ machine learning algorithms do not require annotated data.
- 5.
- 6.
- 7.
The BBC message boards are closely administered and moderated, while Digg posts aren’t.
- 8.
Studies (Bradley and Lang 1999) have shown that emotions can be characterised as “coincidence of values on a number of different strategic dimensions”, such as valence and arousal.
- 9.
- 10.
Stratification ensures that the splits for training and testing are as equal as possible for every class.
- 11.
- 12.
The extraction took place in 2009, before the website changed its focus to promotion of music.
- 13.
All the Weka .arff files for all datasets are available upon request.
- 14.
The full list of discussion threads can be found in the Appendix material at http://doi.ieeecomputersociety.org/10.1109/T-AFFC.2012.26.
- 15.
That may be due to the fact that by definition the geometric mean is less susceptible to outliers than the arithmetic mean.
- 16.
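The remark in note 15 about the geometric mean can be illustrated with a small worked example. This is a sketch with made-up numbers, not data from the chapter; the helper names are illustrative. It shows one large outlier pulling the arithmetic mean much further than the geometric mean. The geometric mean is only defined here for strictly positive values, and it is in turn more sensitive than the arithmetic mean to values close to zero.

```python
import math


def arithmetic_mean(values):
    """Ordinary average: sum divided by count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed via logs for numerical stability."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Illustrative values only: three typical scores plus one large outlier.
baseline = [2.0, 2.0, 2.0, 2.0]
with_outlier = [2.0, 2.0, 2.0, 32.0]

if __name__ == "__main__":
    for name, data in (("baseline", baseline), ("with outlier", with_outlier)):
        print(name,
              "arithmetic:", round(arithmetic_mean(data), 3),
              "geometric:", round(geometric_mean(data), 3))
```

With these numbers the arithmetic mean moves from 2 to 9.5, while the geometric mean moves from 2 to about 4.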
References
Ahn, J., Gobron, S., Silvestre, Q., Thalmann, D.: Asymmetrical facial expressions based on an advanced interpretation of two-dimensional Russell’s emotional model. In: ENGAGE 2010, pp. 1–12 (2010)
Asur, S., Huberman, B.A.: Predicting the future with social media. In: Huang, X.J., King, I., Raghavan, V., Rueger, S. (eds.) Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, vol. 01, pp. 492–499. IEEE Computer Society, Washington (2010). doi:10.1109/WI-IAT.2010.63
Baccianella, S., Esuli, A., Sebastiani, F.: SentiWordNet 3.0. In: Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., Tapias, D. (eds.) Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC’10), Valletta, pp. 2200–2204 (2010)
Barrett, L.F., Russell, J.A.: The structure of current affect: controversies and emerging consensus. Curr. Dir. Psychol. Sci. 8 (1), 10–14 (1999). doi:10.1111/1467-8721.00003
Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)
Bradley, M.M., Lang, P.J.: Affective norms for English words (ANEW): instruction manual and affective ratings. Tech. Rep. C-1, University of Florida: Center for Research in Psychophysiology (1999)
Carvalho, P., Sarmento, L., Silva, M.J., de Oliveira, E.: Clues for detecting irony in user-generated contents: oh…!! it’s “so easy”;-). In: Jiang, M., Yu, B. (eds.) Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pp. 53–56. ACM, New York (2009). doi:10.1145/1651461.1651471
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001). Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Cornelius, R.R.: The Science of Emotion. Prentice Hall, Upper Saddle River (1996)
Dodds, P., Danforth, C.: Measuring the happiness of large-scale written expression: songs, blogs, and presidents. J. Happiness Stud. 11 (4), 441–456 (2010). doi:10.1007/s10902-009-9150-9
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9 (August), 1871–1874 (2008)
Fox, E.: Emotion Science. Palgrave Macmillan, London (2008)
González-Bailón, S., Banchs, R.E., Kaltenbrunner, A.: Emotions, public opinion, and U.S. presidential approval rates: a 5-year analysis of online political discussions. Hum. Commun. Res. 38 (2), 121–143 (2012). doi:10.1111/j.1468-2958.2011.01423.x
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. Newsl. 11 (1), 10–18 (2009). doi:10.1145/1656274.1656278
Jelinek, F., Merialdo, B., Roukos, S., Strauss, M.: A dynamic language model for speech recognition. In: Marcus, M.P. (ed.) Proceedings of the Workshop on Speech and Natural Language, pp. 293–295. Association for Computational Linguistics, Stroudsburg (1991). doi:10.3115/112405.112464
Jijkoun, V., de Rijke, M., Weerkamp, W.: Generating focused topic-specific sentiment lexicons. In: Hajic, J., Carberry, S., Clark, S. (eds.) ACL 2010, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 585–594. Association for Computational Linguistics, Stroudsburg (2010)
Joachims, T.: Making large-scale SVM learning practical. In: Schoelkopf, B., Burges, C.J.C., Smola, A.J. (eds.) Advances in Kernel Methods: Support Vector Learning, pp. 169–184. MIT Press, Cambridge (1999)
John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Besnard, P., Hanks, S. (eds.) Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, pp. 338–345. Morgan Kaufmann Publishers, San Francisco (1995)
Keerthi, S.S., Sundararajan, S., Chang, K.W., Hsieh, C.J., Lin, C.J.: A sequential dual method for large scale multi-class linear SVMs. In: Li, Y., Liu, B., Sarawagi, S. (eds.) Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 408–416. ACM, New York (2008). doi:10.1145/1401890.1401942
Kramer, A.D.: An unobtrusive behavioral model of “gross national happiness”. In: Mynatt, E., Fitzpatrick, G., Hudson, S., Edwards, K., Rodden, T. (eds.) CHI’10 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 287–290. ACM, New York (2010). doi:10.1145/1753326.1753369
Le Cessie, S., Van Houwelingen, J.C.: Ridge estimators in logistic regression. Appl. Stat. J. R. Stat. Soc. C 41 (1), 191–201 (1992). doi:10.2307/2347628
Lee, L., Pang, B.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Knight, K., Ng, H.T., Oflazer, K. (eds.) ACL 2005 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 115–124. Association for Computational Linguistics, Stroudsburg (2005)
MacDonald, C., Ounis, I.: The TREC Blogs06 collection: creating and analysing a blog test collection. Tech. Rep. TR-2006-24, Department of Computer Science, University of Glasgow (2006)
MacDonald, C., Ounis, I., Soboroff, I.: Overview of the TREC 2008 Blog Track. In: The Sixteenth Text REtrieval Conference (TREC 2008) Proceedings, NIST Special Publication SP 500-277, p. 1 (2008)
Manning, C.D., Schuetze, H.: Foundations of Statistical Natural Language Processing, 1st edn. MIT Press, Cambridge (1999)
Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval, 1st edn. Cambridge University Press, Cambridge (2008)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38 (11), 39–41 (1995). doi:10.1145/219717.219748
Mishne, G.: Experiments with mood classification in blog posts. In: Proceedings of ACM SIGIR 2005 Workshop on Stylistic Analysis of Text for Information Access (2005)
Mitchell, T.M.: Machine Learning, 1st edn. McGraw-Hill, New York (1997)
Mitrović, M., Paltoglou, G., Tadić, B.: Networks and emotion-driven user communities at popular blogs. Eur. Phys. J. B 77 (4), 597–609 (2010). doi:10.1140/epjb/e2010-00279-x
Owsley, S., Sood, S., Hammond, K.J.: Domain specific affective classification of documents. In: Computational Approaches to Analyzing Weblogs, Papers from the 2006 AAAI Spring Symposium, Technical Report SS-06-03, pp. 181–183. AAAI Press, Menlo Park (2006)
Pak, A., Paroubek, P.: Twitter as a corpus for sentiment analysis and opinion mining. In: Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., Tapias, D. (eds.) Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC’10), Valletta, pp. 1320–1326 (2010)
Paltoglou, G.: Sentiment analysis in social media. In: Agarwal, N., Wigand, R.T., Lim, M. (eds.) Online Collective Action: Dynamics of the Crowd in Social Media. Lecture Notes in Social Networks, pp. 3–18. Springer, Wien (2014). doi:10.1007/978-3-7091-1340-0_1
Paltoglou, G., Buckley, K.: Subjectivity annotation of the microblog 2011 realtime adhoc relevance judgments. In: Serdyukov, P., Braslavski, P., Kuznetsov, S.O., Kamps, J., Rüger, S., Agichtein, E., Segalovich, I., Yilmaz, E. (eds.) Advances in Information Retrieval, 35th European Conference on IR Research, ECIR 2013, Moscow, March 2013. Proceedings, pp. 344–355. Springer, Berlin/Heidelberg (2013). doi:10.1007/978-3-642-36973-5_29
Paltoglou, G., Thelwall, M.: Twitter, Myspace, Digg: unsupervised sentiment analysis in social media. ACM Trans. Intell. Syst. Technol. 3 (4), 66:1–66:19 (2012). doi:10.1145/2337542.2337551
Paltoglou, G., Thelwall, M.: Seeing stars of valence and arousal in blog posts. IEEE Trans. Affect. Comput. 4 (1), 116–123 (2013). doi:10.1109/T-AFFC.2012.36
Paltoglou, G., Thelwall, M., Buckley, K.: Online textual communication annotated with grades of emotion strength. In: Proc. EMOTION, pp. 25–31 (2010)
Paltoglou, G., Theunis, M., Kappas, A., Thelwall, M.: Predicting emotional responses to long informal text. IEEE Trans. Affect. Comput. 4 (1), 106–115 (2013). doi:10.1109/T-AFFC.2012.26
Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retr. 2 (1–2), 1–135 (2008). doi:10.1561/1500000011
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, EMNLP ’02, vol. 10, pp. 79–86. ACL, Stroudsburg (2002)
Pennebaker, J.W., Francis, M.E.: Linguistic Inquiry and Word Count, 1st edn. Lawrence Erlbaum, Mahwah (1999)
Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf, B., Burges, C.J.C., Smola, A.J. (eds.) Advances in Kernel Methods: Support Vector Learning, pp. 185–208. MIT Press, Cambridge (1999)
Ponomareva, N., Thelwall, M.: Do neighbours help?: an exploration of graph-based algorithms for cross-domain sentiment classification. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, Jeju Island, pp. 655–665. ACL, Stroudsburg (2012)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Series in Machine Learning, 1st edn. Morgan Kaufmann, San Francisco (1993)
Quirk, R., Greenbaum, S., Leech, G., Svartvik, J.: A Comprehensive Grammar of the English Language. Longman, New York (1985)
Rifkin, R., Klautau, A.: In defense of one-vs-all classification. J. Mach. Learn. Res. 5, 101–141 (2004)
Robertson, S., Zaragoza, H., Taylor, M.: Simple BM25 extension to multiple weighted fields. In: Proceedings of the 13th ACM International Conference on Information and Knowledge Management, pp. 42–49. ACM, New York (2004)
Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39 (6), 1161–1178 (1980). doi:10.1037/h0077714
Russell, J.A.: Pancultural aspects of the human conceptual organization of emotions. J. Pers. Soc. Psychol. 45 (6), 1281–1288 (1983). doi:10.1037/0022-3514.45.6.1281
Scherer, K.R.: What are emotions? And how can they be measured? Soc. Sci. Inf. 44 (4), 695–729 (2005). doi:10.1177/0539018405058216
Schimmack, U.: Pleasure, displeasure, and mixed feelings: are semantic opposites mutually exclusive? Cognit. Emot. 15 (1), 81–97 (2001). doi:10.1080/02699930126097
Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. 34 (1), 1–47 (2002). doi:10.1145/505282.505283
Shimada, K., Endo, T.: Seeing several stars: a rating inference task for a document containing several evaluation criteria. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) Advances in Knowledge Discovery and Data Mining, 12th Pacific-Asia Conference, PAKDD 2008 Osaka, May 2008 Proceedings. Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence), vol. 5012, pp. 1006–1014. Springer, Berlin/Heidelberg (2008). doi:10.1007/978-3-540-68125-0_106
Strapparava, C., Mihalcea, R.: Learning to identify emotions in text. In: Wainwright, R.L., Haddad, H. (eds.) Proceedings of the 2008 ACM Symposium on Applied Computing (SAC), pp. 1556–1560. ACM, New York (2008). doi:10.1145/1363686.1364052
Strapparava, C., Valitutti, A.: WordNet-Affect: an affective extension of WordNet. In: Lino, M.T., Xavier, M.F., Ferreira, F., Costa, R., Silva, R. (eds.) Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC’04), pp. 1083–1086. European Language Resources Association, Paris (2004)
Thelwall, M., Wilkinson, D.: Public dialogs in social network sites: What is their purpose? J. Am. Soc. Inf. Sci. Technol. 61 (2), 392–404 (2010). doi:10.1002/asi.21241
Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., Kappas, A.: Sentiment strength detection in short informal text. J. Am. Soc. Inf. Sci. Technol. 61 (12), 2544–2558 (2010). doi:10.1002/asi.21416
Thelwall, M., Buckley, K., Paltoglou, G.: Sentiment in twitter events. J. Am. Soc. Inf. Sci. Technol. 62 (2), 406–418 (2011). doi:10.1002/asi.21462
Whitelaw, C., Garg, N., Argamon, S.: Using appraisal groups for sentiment analysis. In: Herzog, O., Scheck, H.J., Fuhr, N., Chowdhury, A., Teiken, W. (eds.) Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 625–631. ACM, New York (2005). doi:10.1145/1099554.1099714
Wiebe, J.M., Bruce, R.F., O’Hara, T.P.: Development and use of a gold-standard data set for subjectivity classifications. In: Dale, R., Church, K.W. (eds.) Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, pp. 246–253. Association for Computational Linguistics, Stroudsburg (1999). doi:10.3115/1034678.1034721
Witten, I.H., Bell, T.C.: The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression. IEEE Trans. Inf. Theory 37 (4), 1085–1094 (1991). doi:10.1109/18.87000
Acknowledgements
This work was supported by a European Union grant under the 7th Framework Programme, Theme 3: Science of complex systems for socially intelligent ICT. It is part of the CyberEmotions project (contract 231323).
© 2017 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Paltoglou, G., Thelwall, M. (2017). Sensing Social Media: A Range of Approaches for Sentiment Analysis. In: Holyst, J. (eds) Cyberemotions. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-43639-5_6
DOI: https://doi.org/10.1007/978-3-319-43639-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43637-1
Online ISBN: 978-3-319-43639-5
eBook Packages: Physics and Astronomy, Physics and Astronomy (R0)