Volume 23, Issue 3, pp 425–447

Spatio-temporal mining of keywords for social media cross-social crawling of emergency events

  • Andrea Autelitano
  • Barbara Pernici
  • Gabriele Scalia


Automatically extracting as many relevant posts as possible from social media in a timely manner is key in many activities, for example to provide useful information for rapidly creating crisis maps during emergency events. While most social media support keyword-based searches, the number and accuracy of retrieved posts depend largely on the keywords employed. The goal of the proposed methodology is to dynamically extract relevant keywords for searching social media during an emergency event, following the event’s evolution. Starting from a set of seed keywords designed for the type of event being considered (floods and earthquakes, in particular), the set of keywords is automatically adjusted taking into account the spatio-temporal features of the monitored event. The aim is to retrieve posts that follow the event’s evolution and to benefit from cross-social crawling, exploiting the specific strengths of one social medium over others. In the case considered in this paper, we exploit the precise geolocation of images posted on Flickr to extract keywords for searching YouTube posts about the same event, since YouTube does not allow spatial crawling yet provides a richer source of information. The methodology was evaluated on three recent major emergency events, demonstrating a large increase in the number of retrieved posts compared with the use of generic seed keywords. This is a relevant improvement both for providing information on emergency events and for following the event’s development.
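
The idea of adjusting keywords from geolocated posts can be illustrated with a minimal sketch. It is not the methodology evaluated in the paper: it assumes that geotagged posts (for example, Flickr images) have already been downloaded into memory, and it simply ranks tags by frequency inside a bounding box and a time window. The `Post` type, the `extract_keywords` function, the bounding box, and the sample data are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from collections import Counter


@dataclass
class Post:
    """Hypothetical in-memory record of a geotagged post."""
    lat: float
    lon: float
    taken: datetime
    tags: list


def extract_keywords(posts, bbox, start, end, seeds, top_k=3):
    """Return the top_k most frequent non-seed tags among the posts that
    fall inside bbox = (min_lat, min_lon, max_lat, max_lon) and [start, end]."""
    min_lat, min_lon, max_lat, max_lon = bbox
    seed_set = {s.lower() for s in seeds}
    counts = Counter()
    for p in posts:
        in_area = min_lat <= p.lat <= max_lat and min_lon <= p.lon <= max_lon
        in_window = start <= p.taken <= end
        if in_area and in_window:
            # Count each tag once per post; seed keywords are already known.
            counts.update({t.lower() for t in p.tags} - seed_set)
    # Sort by descending count, then alphabetically, for a deterministic result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]


# Made-up sample data: three posts near the event, one far away, one too old.
posts = [
    Post(45.46, 9.19, datetime(2018, 10, 29, 10), ["flood", "seveso", "milano"]),
    Post(45.47, 9.20, datetime(2018, 10, 29, 12), ["flood", "seveso"]),
    Post(45.48, 9.18, datetime(2018, 10, 29, 13), ["Seveso", "niguarda"]),
    Post(41.90, 12.50, datetime(2018, 10, 29, 11), ["flood", "tevere"]),
    Post(45.46, 9.19, datetime(2018, 9, 1, 9), ["seveso", "park"]),
]

keywords = extract_keywords(
    posts,
    bbox=(45.3, 9.0, 45.6, 9.4),
    start=datetime(2018, 10, 28),
    end=datetime(2018, 10, 31),
    seeds=["flood"],
    top_k=2,
)
print(keywords)  # ['seveso', 'milano']
```

In this sketch the extracted tags would then be combined with the seed keywords to query a platform that supports only keyword search, such as YouTube; re-running the extraction over successive time windows is one simple way to follow an event as it evolves.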


Media mining · Keyword extraction · Adaptive crawling · Emergency management · Social media



This work was funded by the European Commission H2020 project E2mC “Evolution of Emergency Copernicus services” under project No. 730082. This work expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this work. The authors thank Chiara Francalanci and Paolo Ravanelli for their support throughout this work and Nicole Gervasoni for her support in ground truth analysis and annotations.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Politecnico di Milano, Milano, Italy
