Bloggers’ Responses to the Snowden Affair: Combining Automated and Manual Methods in the Analysis of News Blogging


The Snowden affair gave rise to a huge public debate not only about the legitimacy of the secret surveillance programs he revealed but also about Snowden himself and the accuracy of the information he leaked. In this paper we present an analysis of how the affair was discussed in the English-language blogosphere, based on a corpus of 15,000 blog posts written about Snowden and published from June 2013 to June 2014. This corpus is a sub-corpus of a larger corpus of 100,000 blog posts on the topic of surveillance, written during the period 2006–2014. Automated tools are used to identify the topics that characterize the blogging about surveillance and the posts about the Snowden affair. Through an in-depth analysis of the blog posts that commented on Snowden’s revelations of the PRISM program for surveillance of social media users, we chart how bloggers responded to Snowden and his role in this disclosure, whether they found the information credible, and the extent to which they criticized the surveillance practices. The analysis is used as a basis for discussing the role of blogs in civic engagement during the first phase of the Snowden affair.



  3. “Greenwald,” the name of the leading journalist working on Snowden for The Guardian, is mentioned.

  4. Performed via the WordSmith software for corpus analysis, see

  5. Using the VOSON crawler from




This research was supported by a grant from the Research Council of Norway’s VERDIKT program (NTAP, project 213401). We are very grateful to Knut Hofland and Andrew Salway for their role in creating the corpus analyzed here.

Author information



Corresponding author

Correspondence to Dag Elgesem.



Table 3. Topics in the 15,000 blogs about Snowden.

Cite this article

Elgesem, D., Feinerer, I. & Steskal, L. Bloggers’ Responses to the Snowden Affair: Combining Automated and Manual Methods in the Analysis of News Blogging. Comput Supported Coop Work 25, 167–191 (2016).


  • Blog research
  • Blogs
  • Cluster analysis
  • Social media
  • Surveillance
  • Topic analysis
  • Trust