Analysis of street crime predictors in web open data

  • Yihong Zhang
  • Panote Siriaraya
  • Yukiko Kawai
  • Adam Jatowt


Governments and citizens alike have long sought crime predictors to help prevent or avoid crime. In this paper, we thoroughly analyze crime predictors from three Web open data sources: Google Street View (GSV), Twitter, and Foursquare, which provide visual, textual, and human behavioral data, respectively. In contrast to existing work that attempts crime prediction at zip-code level or coarser granularity, we focus on street-level crime prediction. We transform the data assigned to street segments, then extract and identify strong predictors that correlate with crime. In particular, we are the first to discover visual cues in street appearance that are predictive of crime. We focus on the city of San Francisco, and our extensive experiments demonstrate the effectiveness of the predictors across a range of tests. We show that by analyzing and selecting strong predictors in Web open data, one can achieve significantly better crime prediction accuracy than with traditional prediction based on demographic data.


Crime prediction, Web open data, Image and text analysis




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Social Informatics, Graduate School of Informatics, Kyoto University, Kyoto, Japan
  2. Division of Frontier Informatics, Kyoto Sangyo University, Kyoto, Japan
