DEES: a real-time system for event extraction from disaster-related web text

  • Original Article
Social Network Analysis and Mining

Abstract

The rapid growth of Internet-based communication technologies in the form of Social Media (SM) and associated mobile applications has enabled people to share information related to disaster events in “real-time” as they unfold. People are increasingly using SM platforms to report situational information during disasters, such as critical needs, dead or injured people, and property damage. Despite its usefulness, most of this pertinent data is not available to humanitarian organisations during emergencies, mainly because of data processing and data quality issues. The proliferation of online news media has also led to the exchange of a massive amount of information during disasters, mostly validated by official sources. Integrating SM data with online news reports can provide filtered information while adding more detail on the progress of an event than is already published in online news. This research project introduces the Disaster Event Extraction System (DEES), a real-time system for extracting disaster events from both online news and tweets. DEES is evaluated on a dataset collected during the Nepal earthquake in 2015. Our results suggest that integrating both SM and news text data improves the event extraction system’s performance compared to using SM data alone. A demonstration of DEES is available at: https://mu-clab.github.io/.

Data availability

Not applicable.

Notes

  1. RNZ news, https://www.rnz.co.nz/.

  2. NZ Herald news, https://www.nzherald.co.nz/.

  3. Stuff news, https://www.stuff.co.nz/.

  4. Python tweepy library version 3.9.0, https://pypi.org/project/tweepy/.

  5. Python spaCy version 2.3.2, https://spacy.io/.

  6. DBSCAN algorithm, https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html.

  7. Python newspaper3k, https://newspaper.readthedocs.io/en/latest/.

  8. NER implementation in spaCy, https://spacy.io/api/annotation#named-entities.

  9. Nominatim version 3.5.1, https://github.com/osm-search/Nominatim.

  10. OpenStreetMap, https://www.openstreetmap.org/#map=2/-41.2/-6.6.

  11. geopy 2.2.0, https://pypi.org/project/geopy/.

  12. CrisisNLP dataset, https://crisisnlp.qcri.org/.

Funding

Not applicable.

Author information

Contributions

The authors confirm their contribution to the paper as follows: study conception and design, data collection, analysis and interpretation of results, and draft manuscript preparation: Nilani Algiriyage. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Nilani Algiriyage.

Ethics declarations

Conflict of interest

Not applicable.

Ethical Approval

Not applicable.

About this article

Cite this article

Algiriyage, N., Prasanna, R., Stock, K. et al. DEES: a real-time system for event extraction from disaster-related web text. Soc. Netw. Anal. Min. 13, 6 (2023). https://doi.org/10.1007/s13278-022-01007-2

  • DOI: https://doi.org/10.1007/s13278-022-01007-2