
The Citizen IS the Journalist: Automatically Extracting News from the Swarm

Part of the Springer Proceedings in Complexity book series (SPCOM)


User-generated content has become a major trend in today’s journalistic ecosystem, where in many cases news arrives on social media platforms before it reaches mainstream media. In today’s hyperconnected society, this is becoming more frequent, and “news-like” information is produced all over the Internet: on blogs, on Facebook, Twitter, and Wikipedia, and on any other platform that allows users to share their ideas and experiences. In this chapter, we describe SwarmPulse, a system that combs through Wikipedia and Twitter to extract newsworthy items. We measured the accuracy of SwarmPulse by comparing its output against the Reuters and CNN RSS feeds and the Google News feed, and found a precision of 83% and a recall of 15% against these sources.
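To make the two reported figures concrete, the following is a minimal sketch of how precision and recall can be computed for a set of extracted items against a set of reference feed items, using the standard definitions. The matching rule `same_story` (at least two shared words) and the sample headlines are hypothetical and purely illustrative; the abstract does not describe how the chapter matches SwarmPulse items to feed items.

```python
def same_story(a, b):
    # Hypothetical matcher (an assumption, not the chapter's method):
    # two headlines match if they share at least two lowercase words.
    return len(set(a.lower().split()) & set(b.lower().split())) >= 2


def precision_recall(extracted, reference, matches=same_story):
    # Precision: share of extracted items that match some reference item.
    # Recall: share of reference items that match some extracted item.
    hit_extracted = [e for e in extracted if any(matches(e, r) for r in reference)]
    hit_reference = [r for r in reference if any(matches(e, r) for e in extracted)]
    precision = len(hit_extracted) / len(extracted) if extracted else 0.0
    recall = len(hit_reference) / len(reference) if reference else 0.0
    return precision, recall


# Made-up headlines, for illustration only.
extracted = [
    "earthquake hits city",
    "election results announced",
    "celebrity rumor spreads",
]
reference = [
    "Earthquake hits coastal city",
    "Election results announced today",
    "Stock market falls",
    "New phone released",
]

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Read this way, a high precision with a low recall, as reported above, would mean that most extracted items correspond to stories in the reference feeds, while only a small share of the stories in those feeds are found.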


  • News Item
  • Twitter User
  • News Channel
  • Breaking News
  • Match Strength

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



  1. The reader can try out the prototype version of SwarmPulse at



Author information


Corresponding author

Correspondence to Peter A. Gloor.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

de Oliveira, J.M., Gloor, P.A. (2016). The Citizen IS the Journalist: Automatically Extracting News from the Swarm. In: Zylka, M., Fuehres, H., Fronzetti Colladon, A., Gloor, P. (eds) Designing Networks for Innovation and Improvisation. Springer Proceedings in Complexity. Springer, Cham.


  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42696-9

  • Online ISBN: 978-3-319-42697-6

  • eBook Packages: Computer Science, Computer Science (R0)