Governments and citizens alike have long sought crime predictors to help prevent or avoid crime. In this paper, we thoroughly analyze crime predictors from three Web open data sources: Google Street View (GSV), Twitter, and Foursquare, which provide visual, textual, and human behavioral data, respectively. In contrast to existing works that attempt crime prediction at zip-code level or coarser granularity, we focus on street-level crime prediction. We transform the data assigned to street segments, then extract and identify strong predictors correlated with crime. In particular, we are the first to discover visual cues in street appearance that are predictive of crime. We focus on the city of San Francisco, and our extensive experiments show the effectiveness of the predictors in a range of tests. We show that by analyzing and selecting strong predictors in Web open data, one can achieve significantly better crime prediction accuracy than with traditional prediction based on demographic data.
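As a rough illustration of what street-segment-level analysis can look like, the following minimal sketch assigns geotagged points (for example, check-ins) to their nearest street segment and then measures how the per-segment counts correlate with crime counts. This is not the paper's pipeline: the nearest-segment rule, the use of Pearson correlation, the planar coordinates, the segment identifiers, and all counts are hypothetical choices made only for this example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line and clamp to the segment itself.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def assign_to_segments(points, segments):
    """Count how many points lie nearest to each street segment."""
    counts = {seg_id: 0 for seg_id in segments}
    for p in points:
        nearest = min(
            segments,
            key=lambda s: point_segment_distance(p, segments[s][0], segments[s][1]),
        )
        counts[nearest] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


# Hypothetical street segments, given as (start, end) in planar coordinates.
segments = {
    "s1": ((0.0, 0.0), (10.0, 0.0)),
    "s2": ((0.0, 5.0), (10.0, 5.0)),
    "s3": ((0.0, 10.0), (10.0, 10.0)),
}
# Hypothetical geotagged points, e.g. check-in locations.
points = [(1.0, 0.5), (2.0, 1.0), (3.0, 4.0), (5.0, 9.0), (6.0, 11.0), (7.0, 9.5)]
# Hypothetical crime counts per segment.
crime_counts = {"s1": 4, "s2": 2, "s3": 6}

counts = assign_to_segments(points, segments)
seg_ids = sorted(segments)
r = pearson([counts[s] for s in seg_ids], [crime_counts[s] for s in seg_ids])
```

In a real setting, the coordinates would first need to be projected from latitude and longitude into a planar system, and a spatial index would be preferable to the linear scan over segments used here.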
All data used for the analysis are available upon request.
Cite this article
Zhang, Y., Siriaraya, P., Kawai, Y. et al. Analysis of street crime predictors in web open data. J Intell Inf Syst 55, 535–559 (2020). https://doi.org/10.1007/s10844-019-00587-4
- Crime prediction
- Web open data
- Image and text analysis