
Sensing Social Media: A Range of Approaches for Sentiment Analysis

  • Georgios Paltoglou
  • Mike Thelwall
Part of the Understanding Complex Systems book series (UCS)

Abstract

Sentiment analysis deals with the computational detection and extraction of opinions, beliefs and emotions in written text. It combines theories and methodologies from a diverse set of scientific domains, such as psychology, natural language processing and machine learning. It fulfils the important role of transforming the unstructured textual communication between social media users into quantifiable and informed estimates of expressed sentiment, which physicists, sociologists and complex-systems experts can subsequently use to study the collective properties of such phenomena. The problem has been addressed from two different but often complementary directions: lexicon-based solutions that rely on sentiment dictionaries (i.e., lists of words in which each token is annotated with an indication of the affective content it typically conveys) and machine learning solutions that automatically or semi-automatically learn to detect the affective content of text. In this chapter, we discuss a range of solutions and their strengths and weaknesses in different environments and settings. We conclude that, depending on the application environment and the desired output, different types of analysis are appropriate, with varying levels of predictive accuracy.
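
As a rough illustration of the lexicon-based direction described above, the following minimal Python sketch sums per-token valence scores from a sentiment dictionary. The lexicon entries, their scores, and the function names `lexicon_score` and `classify` are hypothetical and chosen only for illustration; they are not taken from the chapter or from any published lexicon.

```python
import re

# Hypothetical toy lexicon: token -> valence score.
# The words and values are invented for illustration only.
TOY_LEXICON = {
    "good": 2, "great": 3, "happy": 2,
    "bad": -2, "awful": -3, "sad": -2,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every token in `text` that appears in the lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def classify(text, lexicon=TOY_LEXICON):
    """Map the summed score to a coarse polarity label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ("What a great, happy day", "This is awful", "The bus arrived"):
        print(sentence, "->", classify(sentence))
```

A sketch like this ignores negation, intensifiers and domain-specific word use, which is one reason a machine learning solution trained on annotated text may be preferable in some settings.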

Keywords

Language Model; Sentiment Analysis; Text Segment; Private State; Affective Content
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This work was supported by a European Union grant under the 7th Framework Programme, Theme 3: Science of complex systems for socially intelligent ICT. It is part of the CyberEmotions project (contract 231323).

Copyright information

© Springer International Publishing Switzerland 2017

Authors and Affiliations

  1. School of Mathematics and Computer Science, University of Wolverhampton, Wolverhampton, UK
