Data fusion for city life event detection

Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

The automatic detection of events happening in urban areas from mobile phone and social network datasets is an important problem; solving it would enable novel services ranging from city management and emergency response to social and entertainment applications. In this work we present a simple yet effective method for discovering events from spatio-temporal datasets, based on statistical anomaly detection. Our approach can combine multiple sources of information to improve results. We also present a method to automatically generate a keyword-based description of the detected events. We run experiments in two cities with data coming from a mobile phone operator (call detail records, CDRs) and from Twitter. We show that this method gives interesting results in terms of precision and recall. We analyze the parameters of our approach and discuss its strengths and weaknesses.

Notes

  1. We use the name Q75 for this method because we are mainly interested in the comparison with the upper bound, in order to detect overcrowded events.

Acknowledgments

Source of the Dataset: Telecom Italia Big Data Challenge 2014, http://www.telecomitalia.com/bigdatachallenge.

Author information

Corresponding author

Correspondence to Alket Cecaj.

Appendix: Analysis of parameters

This appendix summarizes the experiments we ran, and the results we obtained, to optimize the parameters of our event detection approach. As described in Sect. 2, the approach is based on four key elements:

  • The granularity of the grid used to tessellate the environment.

  • The approach used to compute the normality interval: IQR, Median or Q75.

  • The value k, which sets the length of the normality interval.

  • The threshold t, which is used to filter out minor anomalies.
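As an illustration of how k and t interact, the following Python sketch implements only an IQR-style variant. It assumes that the upper bound of the normality interval is Q3 + k (Q3 - Q1) and that an observation becomes an event when it exceeds that bound by more than t. The precise definitions of the IQR, Median and Q75 intervals are those given in Sect. 2, which may differ; the function names here are hypothetical.

```python
from statistics import quantiles


def iqr_upper_bound(history, k):
    """Upper bound of an IQR-style normality interval (assumed form).

    history: past activity values for one grid cell and time slot.
    k: multiplier that widens or narrows the interval.
    """
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def detect_events(history, observed, k, t):
    """Return (index, value) pairs whose excess over the bound is above t.

    The role of t as a minimum excess is an assumption made for this sketch.
    """
    bound = iqr_upper_bound(history, k)
    return [(i, x) for i, x in enumerate(observed) if x - bound > t]


if __name__ == "__main__":
    history = [10, 20, 30, 40, 50]   # toy activity counts
    observed = [55, 65, 100]
    print(detect_events(history, observed, k=1.0, t=0))
    print(detect_events(history, observed, k=1.0, t=10))
```

With a larger k the bound moves up and fewer observations are outliers at all; with a larger t, outliers that only slightly exceed the bound are discarded.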

In the following we report experiments testing the precision and recall of our approach for different data and parameter configurations. Specifically:

  • Figure 9 reports the experiments with SMS-call data

  • Figure 10 reports the experiments with Internet data

  • Figure 11 reports the experiments with social network data

Each figure reports experiments for both the Milan and Trento provinces. In each figure, subfigures a, b and c illustrate the results for the IQR, Median and Q75 approaches, respectively. Each subfigure shows precision and recall for different values of k between 0.5 and 3; each plot is obtained by varying the value of t. When t is small, almost all outliers are considered events, which typically yields high recall but low precision. Conversely, when t is large, only outliers associated with a very large number of people (i.e. high levels of activity in the given cell) are considered events, which typically yields high precision but low recall. The analysis of these results allows us to identify the best approach (among IQR, Median and Q75) and the best value of k for identifying meaningful events. We instead treat t as a free parameter, to be set according to the application-dependent costs of false positives and false negatives.
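The trade-off controlled by t can be reproduced on toy data. The sketch below is only illustrative: the anomaly scores, the ground-truth labels and the function name are invented for the example, and scoring 1.0 when a denominator is zero is a convention chosen here, not something taken from the experiments.

```python
def precision_recall(scores, is_event, t):
    """Precision and recall when outliers with score > t are reported as events.

    scores: anomaly magnitude of each detected outlier (hypothetical values).
    is_event: ground-truth flags, True if the outlier matches a real event.
    """
    predicted = [s > t for s in scores]
    tp = sum(1 for p, e in zip(predicted, is_event) if p and e)
    fp = sum(1 for p, e in zip(predicted, is_event) if p and not e)
    fn = sum(1 for p, e in zip(predicted, is_event) if not p and e)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # convention for no predictions
    recall = tp / (tp + fn) if (tp + fn) else 1.0     # convention for no real events
    return precision, recall


if __name__ == "__main__":
    scores = [1, 5, 8, 20, 50]
    is_event = [False, False, True, False, True]
    for t in (0, 10, 30):
        print(t, precision_recall(scores, is_event, t))
```

On this toy input, raising t from 0 to 30 moves precision from 0.4 to 1.0 while recall drops from 1.0 to 0.5, which is the qualitative behaviour described above.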

Subfigure d is obtained by selecting the best combination of the previous parameters and varying the grid resolution, aggregating cells into areas of 4, 9, 16 cells, and so on.
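One way to picture this change of resolution is to merge square blocks of b × b base cells (b = 2, 3, 4 gives areas of 4, 9 and 16 cells). The sketch below assumes that the activity of a merged area is the sum of the activity of its cells; both this aggregation rule and the function name are assumptions made for illustration.

```python
def coarsen_grid(grid, b):
    """Merge b x b blocks of cells by summing their activity values.

    grid: list of rows, each a list of per-cell activity counts.
    b: block side in cells (b=2 merges 4 cells, b=3 merges 9, and so on).
    Border blocks are smaller when the grid size is not a multiple of b.
    """
    rows, cols = len(grid), len(grid[0])
    out_rows = (rows + b - 1) // b
    out_cols = (cols + b - 1) // b
    coarse = [[0] * out_cols for _ in range(out_rows)]
    for r in range(rows):
        for c in range(cols):
            coarse[r // b][c // b] += grid[r][c]
    return coarse


if __name__ == "__main__":
    base = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
    print(coarsen_grid(base, 2))  # areas of 4 cells
    print(coarsen_grid(base, 4))  # one area of 16 cells
```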

The results indicate the following best configurations:

  • SMS-call data, Milan: Median, k = 3, cells of 690 m per side

  • SMS-call data, Trento: Median, k = 0.5, cells of 9000 m per side

  • Internet data, Milan: Median, k = 1.5, cells of 2070 m per side

  • Internet data, Trento: Median, k = 1.5, cells of 9000 m per side

  • Social data, Milan: Median, k = 3.0, cells of 2070 m per side

  • Social data, Trento: Median, any k, cells of 9000 m per side

In general, there is a large difference between the best cell sizes for Milan and Trento. In the Trento case the social dataset is very sparse, and precision and recall tend to be the same with every method. The cell size that gives the best results for Trento is 9000 m per side. This is due to the very different population densities of Milan and the province of Trento, which differ by about the same order of magnitude as the cell sizes.

Fig. 9

SMS-call data. a IQR method for different k values. b Median method for different k values. c Q75 method for different k values. d Best method for different cell sizes

Fig. 10

Internet data. a IQR method for different k values. b Median method for different k values. c Q75 method for different k values. d Best method for different cell sizes

Fig. 11

Social data. a IQR method for different k values. b Median method for different k values. c Q75 method for different k values. d Best method for different cell sizes

Cite this article

Cecaj, A., Mamei, M. Data fusion for city life event detection. J Ambient Intell Human Comput 8, 117–131 (2017). https://doi.org/10.1007/s12652-016-0354-7

  • DOI: https://doi.org/10.1007/s12652-016-0354-7
