Efficient online extraction of keywords for localized events in twitter

Abdelhaq, Hamed; Gertz, Michael; Armiti, Ayser

doi:10.1007/s10707-016-0258-x

Published: 30 April 2016

Volume 21, pages 365–388, (2017)
GeoInformatica

Hamed Abdelhaq¹,
Michael Gertz² &
Ayser Armiti¹

Abstract

Messages published via social media sites, such as Twitter, Facebook, and Foursquare, hide a considerable amount of information about real-world events. The timely identification of such events from this huge, unstructured, and noisy user-generated content plays an important role in increasing situation awareness and in supporting applications such as recommendation systems. A large number of these messages are enriched with location information, owing to recent advances in location acquisition techniques. This, in turn, enables location-aware event mining, i.e., the detection and tracking of localized events such as sporting events, demonstrations, or traffic jams. The main building blocks of a localized event are local keywords, which exhibit a surge in usage at the event location. In this paper, we propose an approach for extracting local keywords from a stream of Twitter messages by (1) identifying local keywords and (2) estimating the central location of each keyword. This extraction procedure is performed online using a sliding window over the Twitter stream. Additionally, we address the problem of spatial outliers, which adversely affect the sound identification of local keywords. Spatial outliers occur when people far away from the location of an event use related keywords in their Tweets. We handle this problem by adjusting the spatial distribution of keywords based on their co-occurrence with place names that may refer to the location of an event. To ensure scalability, we utilize a hierarchical spatial index to gradually prune the geographic space and thus perform complex spatial computations efficiently. Extensive comparative experiments are conducted using Twitter data. The analysis of the experimental results shows that our approach outperforms existing methods in terms of efficiency and precision.
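To make the two steps named in the abstract more concrete, the following is a minimal sketch of sliding-window local-keyword extraction. It is an illustration under stated assumptions, not the method of the paper: the abstract does not specify the locality measure, the outlier adjustment, or the hierarchical index, so the sketch substitutes a flat latitude/longitude grid, Shannon entropy over grid cells as a stand-in locality score, and the centre of the busiest cell as the estimated keyword location. The class name, parameters, and thresholds are all hypothetical.

```python
from collections import Counter, defaultdict, deque
from math import floor, log


class SlidingLocalKeywords:
    """Toy sliding-window tracker (illustrative only, not the paper's algorithm)."""

    def __init__(self, window_seconds=3600, cell_deg=0.1):
        self.window_seconds = window_seconds  # assumed window length
        self.cell_deg = cell_deg              # assumed flat grid resolution in degrees
        self.events = deque()                 # (timestamp, cell, words), oldest first
        self.cell_counts = defaultdict(Counter)  # word -> Counter of grid cells

    def _cell(self, lat, lon):
        # Map a coordinate to a grid cell index.
        return (floor(lat / self.cell_deg), floor(lon / self.cell_deg))

    def add(self, timestamp, lat, lon, text):
        """Insert one geotagged message, then drop messages outside the window."""
        cell = self._cell(lat, lon)
        words = set(text.lower().split())
        self.events.append((timestamp, cell, words))
        for word in words:
            self.cell_counts[word][cell] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Remove messages that are at least window_seconds older than `now`.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, cell, words = self.events.popleft()
            for word in words:
                counts = self.cell_counts[word]
                counts[cell] -= 1
                if counts[cell] == 0:
                    del counts[cell]
                if not counts:
                    del self.cell_counts[word]

    def spatial_entropy(self, word):
        """Shannon entropy of the word's distribution over cells (low = concentrated)."""
        counts = self.cell_counts.get(word)
        if not counts:
            return None
        total = sum(counts.values())
        return -sum((c / total) * log(c / total) for c in counts.values())

    def local_keywords(self, min_count=3, max_entropy=0.5):
        """Return {word: (lat, lon)} for frequent, spatially concentrated words."""
        result = {}
        for word, counts in self.cell_counts.items():
            if sum(counts.values()) < min_count:
                continue
            if self.spatial_entropy(word) <= max_entropy:
                (ci, cj), _ = counts.most_common(1)[0]
                # Estimated location: centre of the busiest cell (an assumption).
                result[word] = ((ci + 0.5) * self.cell_deg, (cj + 0.5) * self.cell_deg)
        return result


tracker = SlidingLocalKeywords(window_seconds=60, cell_deg=0.1)
tracker.add(0, 49.41, 8.69, "goal the")
tracker.add(10, 49.42, 8.68, "goal the")
tracker.add(20, 49.43, 8.67, "goal")
tracker.add(30, 52.52, 13.41, "the")
tracker.add(40, 48.14, 11.58, "the")
print(tracker.local_keywords())
```

In this toy stream, "goal" is used only within one grid cell while "the" is spread over three, so only the former passes the entropy threshold. A flat grid keeps the sketch short; a hierarchical index, as mentioned in the abstract, would instead allow coarse regions to be pruned before finer cells are examined.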

Notes

¹ https://about.twitter.com/company, accessed Dec. 2015.
² http://www.openstreetmap.org/.
³ http://open.mapquestapi.com/nominatim.
⁴ http://www.movable-type.co.uk/scripts/latlong.html.
⁵ http://www.geomidpoint.com/.
⁶ http://dev.twitter.com/pages/streaming_api.

Author information

Authors and Affiliations

moovel Group GmbH, Stuttgart, Germany
Hamed Abdelhaq & Ayser Armiti
Heidelberg University, Heidelberg, Germany
Michael Gertz

Corresponding author

Correspondence to Michael Gertz.

About this article

Cite this article

Abdelhaq, H., Gertz, M. & Armiti, A. Efficient online extraction of keywords for localized events in twitter. Geoinformatica 21, 365–388 (2017). https://doi.org/10.1007/s10707-016-0258-x

Received: 20 January 2016
Accepted: 12 April 2016
Published: 30 April 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10707-016-0258-x
