Abstract
Microblogging sites, notably Twitter, have become important sources of real-time situational information during emergency events. Since hundreds to thousands of microblogs (tweets) are generally posted on Twitter during an emergency event, manually going through every tweet is not feasible. Hence, summarization of microblogs posted during emergency events has become an important problem in recent years. Several summarization algorithms have been proposed in the literature, both for general document summarization and specifically for summarization of microblogs. However, to our knowledge, there has not been any systematic analysis of which algorithms are more suitable for summarizing microblogs posted during disasters. In this work, we evaluate and compare the performance of eight extractive summarization algorithms on the task of summarizing microblogs posted during emergency events. Apart from comparing the performance of the algorithms, we also find significant differences among the summaries produced by different algorithms over the same input data.
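To make the notion of extractive summarization concrete, the following is a minimal sketch of a frequency-based extractive summarizer, together with a Jaccard overlap measure for comparing the tweets selected by two summarizers. It is an illustration only and is not the implementation of any of the eight algorithms evaluated in this work. The function names (`tokenize`, `frequency_summary`, `jaccard`), the stopword list, and the example tweets are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "in", "on", "of", "to", "is", "are", "and", "at", "for"}


def tokenize(text):
    """Lowercase a tweet and return its alphanumeric tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def frequency_summary(tweets, k):
    """Select the k tweets whose words are, on average, most frequent overall.

    Ties are broken by original position, and the selected tweets are
    returned in their original order.
    """
    freq = Counter(w for t in tweets for w in tokenize(t))

    def score(tweet):
        words = tokenize(tweet)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(tweets)), key=lambda i: (-score(tweets[i]), i))
    return [tweets[i] for i in sorted(ranked[:k])]


def jaccard(summary_a, summary_b):
    """Jaccard overlap between the sets of tweets chosen by two summarizers."""
    a, b = set(summary_a), set(summary_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


# Hypothetical example tweets, invented for illustration.
tweets = [
    "Earthquake in Nepal, buildings collapsed in Kathmandu",
    "Helpline numbers for Nepal earthquake victims",
    "Praying for everyone",
    "Kathmandu airport closed after earthquake",
]

summary_a = frequency_summary(tweets, 2)
# Stand-in for the output of some other (unspecified) summarizer.
summary_b = [tweets[0], tweets[3]]

print(summary_a)
print(jaccard(summary_a, summary_b))
```

A low Jaccard value between two summaries of the same input indicates that the two methods selected largely different tweets, which is the kind of difference referred to above.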
Notes
1. Availability of implementations: Frequency Summarizer (http://glowingpython.blogspot.in/2014/09/text-summarization-with-nltk.html), Mead (http://www.summarization.com/mead/), and SumBasic (https://github.com/EthanMacdonald/SumBasic). LexRank, LSA and LUHN are available as part of the Python Sumy package (https://pypi.python.org/pypi/sumy). COWTS (proposed in our prior work [5]) and ClusterRank were implemented by us.
References
Imran, M., Castillo, C., Diaz, F., Vieweg, S.: Processing social media messages in mass emergency: a survey. ACM Comput. Surv. 47(4), 67:1–67:38 (2015)
Das, D., Martins, A.F.: A survey on automatic text summarization. Lit. Surv. Lang. Stat. II Course CMU 4, 192–195 (2007)
Gupta, V., Lehal, G.S.: A survey of text summarization extractive techniques. IEEE J. Emerg. Technol. Web Intell. 2(3), 258–268 (2010)
Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2), 159–165 (1958)
Rudra, K., Ghosh, S., Goyal, P., Ganguly, N., Ghosh, S.: Extracting situational information from microblogs during disaster events: a classification-summarization approach. In: Proceedings of ACM CIKM (2015)
Olariu, A.: Efficient online summarization of microblogging streams. In: Proceedings of EACL (short paper), pp. 236–240 (2014)
Shou, L., Wang, Z., Chen, K., Chen, G.: Sumblr: continuous summarization of evolving tweet streams. In: Proceedings of ACM SIGIR (2013)
Wang, Z., Shou, L., Chen, K., Chen, G., Mehrotra, S.: On summarization and timeline generation for evolutionary tweet streams. IEEE Trans. Knowl. Data Eng. 27, 1301–1314 (2015)
Zubiaga, A., Spina, D., Amigó, E., Gonzalo, J.: Towards real-time summarization of scheduled events from twitter streams. In: Hypertext (Poster) (2012)
Erkan, G., Radev, D.R.: LexRank: Graph-Based Lexical Centrality as Salience in Text Summarization, pp. 457–479 (2004)
Dutta, S., Ghatak, S., Roy, M., Ghosh, S., Das, A.K.: A graph based clustering technique for tweet summarization. In: 2015 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), pp. 1–6. IEEE (2015)
Xu, W., Grishman, R., Meyers, A., Ritter, A.: A preliminary study of tweet summarization using information extraction. In: Proceedings of NAACL 2013, 20 (2013)
Chakrabarti, D., Punera, K.: Event summarization using tweets. In: Proceedings of AAAI ICWSM, pp. 340–348 (2011)
Gillani, M., Ilyas, M.U., Saleh, S., Alowibdi, J.S., Aljohani, N., Alotaibi, F.S.: Post summarization of microblogs of sporting events. In: Proceedings of International Conference on World Wide Web (WWW) Companion, pp. 59–68 (2017)
Khan, M.A.H., Bollegala, D., Liu, G., Sezaki, K.: Multi-tweet summarization of real-time events. In: Socialcom (2013)
Nichols, J., Mahmud, J., Drews, C.: Summarizing sporting events using twitter. In: Proceedings of ACM International Conference on Intelligent User Interfaces (IUI), pp. 189–198 (2012)
Takamura, H., Yokono, H., Okumura, M.: Summarizing a document stream. In: Proceedings of ECIR (2011)
Osborne, M., Moran, S., McCreadie, R., Lunen, A.V., Sykora, M., Cano, E., Ireson, N., Macdonald, C., Ounis, I., He, Y., Jackson, T., Ciravegna, F., O'Brien, A.: Real-time detection, tracking, and monitoring of automatically discovered events in social media. In: Proceedings of ACL (2014)
Kedzie, C., McKeown, K., Diaz, F.: Predicting salient updates for disaster summarization. In: Proceedings of ACL (2015)
Nguyen, M.T., Kitamoto, A., Nguyen, T.T.: Tsum4act: a framework for retrieving and summarizing actionable tweets during a disaster for reaction. In: Proceedings of PAKDD (2015)
Inouye, D.I., Kalita, J.K.: Comparing twitter summarization algorithms for multiple post summaries. In: Proceedings of IEEE SocialCom/PASSAT, pp. 298–306 (2011)
Mackie, S., McCreadie, R., Macdonald, C., Ounis, I.: Comparing algorithms for microblog summarisation. In: Proceedings of CLEF (2014)
Rosa, K.D., Shah, R., Lin, B., Gershman, A., Frederking, R.: Topical Clustering of Tweets
Garg, N., Favre, B., Riedhammer, K., Hakkani-Tür, D.: Clusterrank: a graph based method for meeting summarization. In: INTERSPEECH, pp. 1499–1502. ISCA (2009)
Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res. 22(1), 457–479 (2004)
Gong, Y., Liu, X.: Generic text summarization using relevance measure and latent semantic analysis. In: SIGIR, pp. 19–25 (2001)
Ozsoy, M.G., Alpaslan, F.N., Cicekli, I.: Text summarization using latent semantic analysis. J. Inf. Sci. 37(4), 405–417 (2011). http://dx.doi.org/10.1177/0165551511408848
Radev, D.R., Allison, T., Blair-Goldensohn, S., Blitzer, J., Çelebi, A., Dimitrov, S., Drábek, E., Hakim, A., Lam, W., Liu, D., Otterbacher, J., Qi, H., Saggion, H., Teufel, S., Topper, M., Winkel, A., Zhang, Z.: MEAD: a platform for multidocument multilingual text summarization. In: LREC. European Language Resources Association (2004)
Radev, D.R., Hovy, E., McKeown, K.: Introduction to the special issue on summarization. Comput. Linguist. 28(4), 399–408 (2002)
Nenkova, A., Vanderwende, L.: The impact of frequency on summarization. Technical report, Microsoft Research (2005)
Hyderabad blasts—Wikipedia (2013). http://en.wikipedia.org/wiki/2013_Hyderabad_blasts
Sandy Hook Elementary School shooting–Wikipedia (2012). http://en.wikipedia.org/wiki/Sandy_Hook_Elementary_School_shooting
North India floods—Wikipedia (2013). http://en.wikipedia.org/wiki/2013_North_India_floods
Typhoon Hagupit—Wikipedia (2014). http://en.wikipedia.org/wiki/Typhoon_Hagupit
2015 Nepal earthquake—Wikipedia (2015). http://en.wikipedia.org/wiki/2015_Nepal_earthquake
REST API Resources, Twitter Developers. https://dev.twitter.com/docs/api
Tao, K., Abel, F., Hauff, C., Houben, G.J., Gadiraju, U.: Groundhog day: near-duplicate detection on twitter. In: Proceedings of Conference on World Wide Web (WWW) (2013)
Lin, C.Y.: ROUGE: A package for automatic evaluation of summaries. In: Proceedings of Workshop on Text Summarization Branches Out, ACL, pp. 74–81 (2004)
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dutta, S., Chandra, V., Mehra, K., Ghatak, S., Das, A.K., Ghosh, S. (2019). Summarizing Microblogs During Emergency Events: A Comparison of Extractive Summarization Algorithms. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 813. Springer, Singapore. https://doi.org/10.1007/978-981-13-1498-8_76
DOI: https://doi.org/10.1007/978-981-13-1498-8_76
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1497-1
Online ISBN: 978-981-13-1498-8
eBook Packages: Intelligent Technologies and Robotics (R0)