CrimeTelescope: crime hotspot prediction based on urban and social media data fusion

World Wide Web

Abstract

Crime is a complex social issue that affects a considerable number of individuals within a society, and preventing and reducing it is a top priority in many countries. Because policing and crime reduction resources are limited, it is crucial to identify effective strategies for deploying them. Crime hotspot prediction has been suggested as one such strategy: it leverages past data to identify geographical areas that are likely to experience crime in the future. However, most existing techniques rely solely on historical crime records to identify hotspots and ignore the predictive power of other data, such as urban or social media data. In this paper, we propose CrimeTelescope, a platform that predicts and visualizes crime hotspots based on a fusion of different data types. Our platform continuously collects crime data as well as urban and social media data from the Web. It then extracts key features from the collected data using both statistical and linguistic analysis. Finally, it identifies crime hotspots by leveraging the extracted features and visualizes the hotspots on an interactive map. Using real-world data collected from New York City, we show that combining different types of data improves crime hotspot prediction accuracy by up to 5.2% compared to classical approaches based on historical crime records only. In addition, we demonstrate the usability of our platform through a System Usability Scale (SUS) survey on a full prototype of CrimeTelescope.

Notes

  1. https://nycopendata.socrata.com/

  2. http://stamen.com/work/crimespotting/

  3. https://www.spotcrime.com/

  4. http://www.crimemapping.com/

  5. https://www.census.gov/ces/dataproducts/demographicdata.html

  6. http://oauth.net/

  7. https://developers.google.com/maps/

  8. https://nycopendata.socrata.com/

  9. https://foursquare.com/

  10. https://developer.foursquare.com/categorytree

  11. https://twitter.com/

  12. https://dev.twitter.com/streaming/public

  13. Note that the self-defined stop words are iteratively selected. More precisely, analyzing the LDA results allows us to detect frequent words that do not add any meaning to the documents and include them in the stop word list for another round of LDA training.

  14. http://prediction.heaney.ch/

  15. https://docs.google.com/forms
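
The iterative stop-word selection described in note 13 could be approximated as in the sketch below. This is a minimal illustration, not the authors' implementation. The note does not say how uninformative words are identified (it may well be by manual inspection), so the rule used here, flagging any word that ranks among the top words of at least half of the topics, is an assumption. The names `refine_stop_words`, `train_topics`, `min_topic_share` and `max_rounds` are hypothetical, and `toy_topics` is a stand-in for a real LDA trainer.

```python
from collections import Counter
from typing import Callable, Iterable

Topics = list[list[str]]  # each topic is represented by its top words


def refine_stop_words(
    docs: list[list[str]],
    train_topics: Callable[[list[list[str]]], Topics],
    stop_words: Iterable[str] = (),
    min_topic_share: float = 0.5,  # assumed threshold, not from the paper
    max_rounds: int = 5,           # assumed cap on retraining rounds
) -> set[str]:
    """Grow a stop-word list over repeated rounds of topic training."""
    stops = set(stop_words)
    for _ in range(max_rounds):
        # Remove the current stop words before (re)training.
        filtered = [[w for w in doc if w not in stops] for doc in docs]
        topics = train_topics(filtered)
        if not topics:
            break
        # Count in how many topics each word appears among the top words.
        topic_counts = Counter(w for topic in topics for w in set(topic))
        candidates = {
            w for w, n in topic_counts.items()
            if n / len(topics) >= min_topic_share
        }
        if not candidates:
            break  # no word spans enough topics; the list has stabilized
        stops |= candidates
    return stops


def toy_topics(docs: list[list[str]]) -> Topics:
    """Stand-in for LDA: one 'topic' per document, its 2 most frequent words."""
    return [[w for w, _ in Counter(doc).most_common(2)] for doc in docs if doc]


docs = [
    ["rt", "rt", "rt", "theft", "theft", "subway"],
    ["rt", "rt", "rt", "party", "party", "bar"],
    ["rt", "rt", "rt", "traffic", "traffic", "bridge"],
]
print(refine_stop_words(docs, toy_topics))  # {'rt'}
```

In practice `train_topics` would wrap an actual LDA library and return the top words of each learned topic. Passing the trainer in as a callable keeps the loop independent of any particular library.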

Acknowledgments

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement 683253/GraphInt).

Author information

Corresponding author

Correspondence to Dingqi Yang.

About this article

Cite this article

Yang, D., Heaney, T., Tonon, A. et al. CrimeTelescope: crime hotspot prediction based on urban and social media data fusion. World Wide Web 21, 1323–1347 (2018). https://doi.org/10.1007/s11280-017-0515-4

  • DOI: https://doi.org/10.1007/s11280-017-0515-4
