Comparing Algorithms for Microblog Summarisation

  • Stuart Mackie
  • Richard McCreadie
  • Craig Macdonald
  • Iadh Ounis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8685)

Abstract

Event detection and tracking using social media and user-generated content has received considerable attention from the research community in recent years, since such sources can purportedly provide up-to-date information about events as they evolve, e.g. earthquakes. Concisely reporting (summarising) events for users and emergency services using information obtained from social media sources such as Twitter remains an unsolved problem. Current systems either directly apply, or build upon, classical summarisation approaches previously shown to be effective within the newswire domain. However, to date, research into how well these approaches generalise from the newswire domain to the microblog domain is limited. Hence, in this paper, we compare the performance of eleven summarisation approaches on four microblog summarisation datasets, with the aim of determining which are the most effective and should therefore be used as baselines in future research. Our results indicate that the SumBasic algorithm and Centroid-based summarisation with redundancy reduction are the most effective approaches across the four datasets and five automatic summarisation evaluation measures tested.
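
For readers unfamiliar with SumBasic, the following is a minimal sketch of the frequency-based selection idea described by Nenkova and Vanderwende, applied to a list of posts. It is not the implementation evaluated in this paper. The function name `sumbasic`, the `max_posts` parameter, and the whitespace tokenisation are assumptions made for illustration, and the sketch omits the original algorithm's requirement that each selected sentence contain the current highest-probability word.

```python
from collections import Counter


def sumbasic(posts, max_posts=2):
    """Select up to max_posts posts using a simplified SumBasic procedure."""
    # Assumption: lower-casing and whitespace splitting stand in for a real tokeniser.
    tokenised = [post.lower().split() for post in posts]
    counts = Counter(word for tokens in tokenised for word in tokens)
    total = sum(counts.values())
    # Unigram probability of each word over the whole input collection.
    prob = {word: count / total for word, count in counts.items()}

    selected = []
    candidates = [i for i, tokens in enumerate(tokenised) if tokens]
    while candidates and len(selected) < max_posts:
        # Score each remaining post by the mean probability of its words.
        best = max(
            candidates,
            key=lambda i: sum(prob[w] for w in tokenised[i]) / len(tokenised[i]),
        )
        selected.append(best)
        candidates.remove(best)
        # Squaring the probabilities of covered words discourages redundancy.
        for word in set(tokenised[best]):
            prob[word] **= 2
    return [posts[i] for i in selected]
```

Given two identical posts about an earthquake and a third about rescue teams, the probability update makes the duplicate less attractive after the first selection, so the second pick is the post that adds new words.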

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Stuart Mackie (1)
  • Richard McCreadie (1)
  • Craig Macdonald (1)
  • Iadh Ounis (1)
  1. School of Computing Science, University of Glasgow, UK
