Sentiment Analysis of Online Media

  • Michael Salter-Townshend
  • Thomas Brendan Murphy
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


A joint model for annotation bias and document classification is presented in the context of media sentiment analysis. We consider an Irish online media data set comprising online news articles with user annotations of negative, positive or irrelevant impact on the Irish economy. The joint model combines a statistical model for user annotation bias and a Naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles. The joint modeling of both the user biases and the classifier is demonstrated to be superior to estimation of the bias followed by the estimation of the classifier parameters.
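As an illustration only, the kind of joint estimation described in the abstract can be sketched as a single EM loop. The sketch below is not the authors' implementation. It assumes a Dawid and Skene style confusion matrix per user for the annotation bias, a multinomial Naive Bayes model over term counts, and a small additive smoothing constant; the function name `joint_em`, its arguments and the toy data are all hypothetical.

```python
import math

def joint_em(annotations, docs, n_classes, n_users, vocab_size,
             n_iter=50, smooth=0.01):
    """Hypothetical sketch of joint EM for annotator bias + Naive Bayes.

    annotations: list of (doc_index, user_index, label) triples.
    docs: list of {term_index: count} dictionaries, one per document.
    Returns class responsibilities T, class prior pi, per-user confusion
    matrices theta[user][true_class][label] and term probabilities phi.
    """
    n_docs = len(docs)
    # Initialise responsibilities from smoothed vote proportions.
    T = [[smooth] * n_classes for _ in range(n_docs)]
    for i, _, label in annotations:
        T[i][label] += 1.0
    T = [[v / sum(row) for v in row] for row in T]

    for _ in range(n_iter):
        # M-step: class prior (smoothed so that log(pi) is always defined).
        pi = [(smooth + sum(row[k] for row in T)) / (n_docs + n_classes * smooth)
              for k in range(n_classes)]
        # M-step: confusion matrix for each user (assumed bias model).
        theta = [[[smooth] * n_classes for _ in range(n_classes)]
                 for _ in range(n_users)]
        for i, u, label in annotations:
            for k in range(n_classes):
                theta[u][k][label] += T[i][k]
        theta = [[[v / sum(row) for v in row] for row in user] for user in theta]
        # M-step: per-class term probabilities (multinomial Naive Bayes).
        phi = [[smooth] * vocab_size for _ in range(n_classes)]
        for i, doc in enumerate(docs):
            for w, c in doc.items():
                for k in range(n_classes):
                    phi[k][w] += T[i][k] * c
        phi = [[v / sum(row) for v in row] for row in phi]

        # E-step: combine prior, document terms and user annotations.
        logp = [[math.log(pi[k])
                 + sum(c * math.log(phi[k][w]) for w, c in doc.items())
                 for k in range(n_classes)] for doc in docs]
        for i, u, label in annotations:
            for k in range(n_classes):
                logp[i][k] += math.log(theta[u][k][label])
        T = []
        for row in logp:
            m = max(row)  # subtract the max for numerical stability
            e = [math.exp(v - m) for v in row]
            T.append([v / sum(e) for v in e])
    return T, pi, theta, phi

# Toy data (invented): two classes, four terms, three users.
# User 2 answers 1 for every document; document 4 has no annotations.
toy_docs = [{0: 3, 1: 1}, {0: 2, 1: 2}, {2: 3, 3: 1}, {2: 1, 3: 3}, {0: 2, 1: 1}]
toy_annotations = [(0, 0, 0), (0, 1, 0), (0, 2, 1), (1, 0, 0), (1, 1, 0), (1, 2, 1),
                   (2, 0, 1), (2, 1, 1), (2, 2, 1), (3, 0, 1), (3, 1, 1), (3, 2, 1)]
T, pi, theta, phi = joint_em(toy_annotations, toy_docs,
                             n_classes=2, n_users=3, vocab_size=4)
predicted = [row.index(max(row)) for row in T]
```

In this sketch the unannotated document is classified from its terms alone, while the annotated documents draw on both the terms and the bias-adjusted annotations. A two-stage alternative would first estimate the confusion matrices from the annotations only and then fit the classifier to the resulting labels; the abstract reports that the joint approach performed better on the authors' data.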



This work is supported by the Science Foundation Ireland under Grant No. 08/SRC/I1407: Clique: Graph & Network Analysis Cluster.



Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Michael Salter-Townshend (1)
  • Thomas Brendan Murphy (1)
  1. School of Mathematical Sciences and Complex and Adaptive Systems Laboratory, University College Dublin, Dublin 4, Ireland