Summarizing Microblogs During Emergency Events: A Comparison of Extractive Summarization Algorithms

  • Conference paper
Emerging Technologies in Data Mining and Information Security

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 813)

Abstract

Microblogging sites, notably Twitter, have become important sources of real-time situational information during emergency events. Because hundreds to thousands of microblogs (tweets) are typically posted on Twitter during such an event, reading every tweet manually is not feasible. Summarization of microblogs posted during emergency events has therefore become an important problem in recent years. Several summarization algorithms have been proposed in the literature, both for general document summarization and specifically for microblogs. However, to our knowledge, there has been no systematic analysis of which algorithms are better suited to summarizing microblogs posted during disasters. In this work, we evaluate and compare the performance of eight extractive summarization algorithms on the task of summarizing microblogs posted during emergency events. Beyond comparing the algorithms' performance, we also find significant differences among the summaries that different algorithms produce from the same input data.
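One simple way to quantify how much two extractive summaries of the same tweet set differ is the Jaccard overlap of the tweets each one selects. The sketch below is illustrative only: the function name `summary_overlap` and the choice of Jaccard overlap are assumptions made here, not the measure used in the paper.

```python
def summary_overlap(summary_a, summary_b):
    """Jaccard overlap between two extractive summaries.

    Each summary is a list of tweet strings selected from the same input.
    Returns 1.0 for identical selections and 0.0 for disjoint ones.
    (Hypothetical helper; not taken from the paper.)
    """
    set_a, set_b = set(summary_a), set(summary_b)
    if not set_a and not set_b:
        # Two empty summaries are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

A low overlap between two algorithms' summaries means they surface largely different tweets from the same stream, even if their evaluation scores are similar.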

Notes

  1. Availability of implementations: Frequency Summarizer (http://glowingpython.blogspot.in/2014/09/text-summarization-with-nltk.html), MEAD (http://www.summarization.com/mead/), and SumBasic (https://github.com/EthanMacdonald/SumBasic) are available online; LexRank, LSA, and Luhn are available as part of the Python Sumy package (https://pypi.python.org/pypi/sumy). COWTS (proposed in our prior work [5]) and ClusterRank were implemented by us.
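To give a concrete sense of what a frequency-based extractive summarizer does, here is a minimal sketch in the spirit of SumBasic [30]. It is a simplified illustration, not the implementation linked above: the stopword list, the tokenizer, and the names `tokenize` and `sumbasic` are assumptions made here, and the original algorithm's rule of first picking the highest-probability word is omitted.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"a", "an", "the", "in", "on", "of", "for", "to", "and", "is", "are"}


def tokenize(text):
    """Lowercase the text and keep word-like tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9#@']+", text.lower())
            if w not in STOPWORDS]


def sumbasic(tweets, max_tweets=2):
    """Select up to max_tweets tweets with a simplified SumBasic procedure."""
    tokens = [tokenize(t) for t in tweets]
    counts = Counter(w for toks in tokens for w in toks)
    total = sum(counts.values())
    if total == 0:
        return []
    # Unigram probability of each word over the whole input.
    prob = {w: c / total for w, c in counts.items()}

    chosen = []
    candidates = [i for i, toks in enumerate(tokens) if toks]
    while candidates and len(chosen) < max_tweets:
        # Score each remaining tweet by the average probability of its words.
        best = max(candidates,
                   key=lambda i: sum(prob[w] for w in tokens[i]) / len(tokens[i]))
        chosen.append(best)
        candidates.remove(best)
        # Square the probability of words already covered, which
        # discourages selecting redundant tweets in later rounds.
        for w in set(tokens[best]):
            prob[w] **= 2
    return [tweets[i] for i in chosen]
```

Calling `sumbasic(tweets, max_tweets=5)` on a list of tweet strings returns the selected tweets in selection order. Because word probabilities are squared after each pick, later picks tend to favor tweets whose vocabulary is not yet covered.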

References

  1. Imran, M., Castillo, C., Diaz, F., Vieweg, S.: Processing social media messages in mass emergency: a survey. ACM Comput. Surv. 47(4), 67:1–67:38 (2015)

  2. Das, D., Martins, A.F.: A survey on automatic text summarization. Lit. Surv. Lang. Stat. II Course CMU 4, 192–195 (2007)

  3. Gupta, V., Lehal, G.S.: A survey of text summarization extractive techniques. IEEE J. Emerg. Technol. Web Intell. 2(3), 258–268 (2010)

  4. Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2), 159–165 (1958)

  5. Rudra, K., Ghosh, S., Goyal, P., Ganguly, N., Ghosh, S.: Extracting situational information from microblogs during disaster events: a classification-summarization approach. In: Proceedings of ACM CIKM (2015)

  6. Olariu, A.: Efficient online summarization of microblogging streams. In: Proceedings of EACL (short paper), pp. 236–240 (2014)

  7. Shou, L., Wang, Z., Chen, K., Chen, G.: Sumblr: continuous summarization of evolving tweet streams. In: Proceedings of ACM SIGIR (2013)

  8. Wang, Z., Shou, L., Chen, K., Chen, G., Mehrotra, S.: On summarization and timeline generation for evolutionary tweet streams. IEEE Trans. Knowl. Data Eng. 27, 1301–1314 (2015)

  9. Zubiaga, A., Spina, D., Amigo, E., Gonzalo, J.: Towards real-time summarization of scheduled events from twitter streams. In: Hypertext (Poster) (2012)

  10. Erkan, G., Radev, D.R.: LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res. 22(1), 457–479 (2004)

  11. Dutta, S., Ghatak, S., Roy, M., Ghosh, S., Das, A.K.: A graph based clustering technique for tweet summarization. In: 2015 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), pp. 1–6. IEEE (2015)

  12. Xu, W., Grishman, R., Meyers, A., Ritter, A.: A preliminary study of tweet summarization using information extraction. In: Proceedings of NAACL 2013, 20 (2013)

  13. Chakrabarti, D., Punera, K.: Event summarization using tweets. In: Proceedings of AAAI ICWSM, pp. 340–348 (2011)

  14. Gillani, M., Ilyas, M.U., Saleh, S., Alowibdi, J.S., Aljohani, N., Alotaibi, F.S.: Post summarization of microblogs of sporting events. In: Proceedings of International Conference on World Wide Web (WWW) Companion, pp. 59–68 (2017)

  15. Khan, M.A.H., Bollegala, D., Liu, G., Sezaki, K.: Multi-tweet summarization of real-time events. In: Socialcom (2013)

  16. Nichols, J., Mahmud, J., Drews, C.: Summarizing sporting events using twitter. In: Proceedings of ACM International Conference on Intelligent User Interfaces (IUI), pp. 189–198 (2012)

  17. Takamura, H., Yokono, H., Okumura, M.: Summarizing a document stream. In: Proceedings of ECIR (2011)

  18. Osborne, M., Moran, S., McCreadie, R., Lunen, A.V., Sykora, M., Cano, E., Ireson, N., Macdonald, C., Ounis, I., He, Y., Jackson, T., Ciravegna, F., O'Brien, A.: Real-time detection, tracking, and monitoring of automatically discovered events in social media. In: Proceedings of ACL (2014)

  19. Kedzie, C., McKeown, K., Diaz, F.: Predicting salient updates for disaster summarization. In: Proceedings of ACL (2015)

  20. Nguyen, M.T., Kitamoto, A., Nguyen, T.T.: Tsum4act: a framework for retrieving and summarizing actionable tweets during a disaster for reaction. In: Proceedings of PAKDD (2015)

  21. Inouye, D.I., Kalita, J.K.: Comparing twitter summarization algorithms for multiple post summaries. In: Proceedings of IEEE SocialCom/PASSAT, pp. 298–306 (2011)

  22. Mackie, S., McCreadie, R., Macdonald, C., Ounis, I.: Comparing algorithms for microblog summarisation. In: Proceedings of CLEF (2014)

  23. Rosa, K.D., Shah, R., Lin, B., Gershman, A., Frederking, R.: Topical Clustering of Tweets

  24. Garg, N., Favre, B., Riedhammer, K., Hakkani-Tür, D.: ClusterRank: a graph based method for meeting summarization. In: INTERSPEECH, pp. 1499–1502. ISCA (2009)

  25. Erkan, G., Radev, D.R.: LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res. 22(1), 457–479 (2004)

  26. Gong, Y., Liu, X.: Generic text summarization using relevance measure and latent semantic analysis. In: SIGIR, pp. 19–25 (2001)

  27. Ozsoy, M.G., Alpaslan, F.N., Cicekli, I.: Text summarization using latent semantic analysis. J. Inf. Sci. 37(4), 405–417 (2011). http://dx.doi.org/10.1177/0165551511408848

  28. Radev, D.R., Allison, T., Blair-Goldensohn, S., Blitzer, J., Çelebi, A., Dimitrov, S., Drábek, E., Hakim, A., Lam, W., Liu, D., Otterbacher, J., Qi, H., Saggion, H., Teufel, S., Topper, M., Winkel, A., Zhang, Z.: MEAD - a platform for multidocument multilingual text summarization. In: LREC. European Language Resources Association (2004)

  29. Radev, D.R., Hovy, E., McKeown, K.: Introduction to the special issue on summarization. Comput. Linguist. 28(4), 399–408 (2002)

  30. Nenkova, A., Vanderwende, L.: The impact of frequency on summarization. Technical report, Microsoft Research (2005)

  31. Hyderabad blasts—Wikipedia (2013). http://en.wikipedia.org/wiki/2013_Hyderabad_blasts

  32. Sandy Hook Elementary School shooting–Wikipedia (2012). http://en.wikipedia.org/wiki/Sandy_Hook_Elementary_School_shooting

  33. North India floods—Wikipedia (2013). http://en.wikipedia.org/wiki/2013_North_India_floods

  34. Typhoon Hagupit—Wikipedia (2014). http://en.wikipedia.org/wiki/Typhoon_Hagupit

  35. 2015 Nepal earthquake—Wikipedia (2015). http://en.wikipedia.org/wiki/2015_Nepal_earthquake

  36. REST API Resources, Twitter Developers. https://dev.twitter.com/docs/api

  37. Tao, K., Abel, F., Hauff, C., Houben, G.J., Gadiraju, U.: Groundhog day: near-duplicate detection on twitter. In: Proceedings of Conference on World Wide Web (WWW) (2013)

  38. Lin, C.Y.: ROUGE: A package for automatic evaluation of summaries. In: Proceedings of Workshop on Text Summarization Branches Out, ACL, pp. 74–81 (2004)

Corresponding author

Correspondence to Soumi Dutta.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Dutta, S., Chandra, V., Mehra, K., Ghatak, S., Das, A.K., Ghosh, S. (2019). Summarizing Microblogs During Emergency Events: A Comparison of Extractive Summarization Algorithms. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 813. Springer, Singapore. https://doi.org/10.1007/978-981-13-1498-8_76
