Abstract
Objectives
We illustrate how a machine learning algorithm, Random Forests, can provide accurate long-term predictions of crime at micro places relative to other popular techniques. We also show how recent advances in model summaries can help to open the ‘black box’ of Random Forests, considerably improving their interpretability.
Methods
We generate long-term crime forecasts for robberies in Dallas at 200 by 200 feet grid cells that allow spatially varying associations of crime generators and demographic factors across the study area. We then show how using interpretable model summaries facilitates understanding of the model’s inner workings.
Results
We find that Random Forests greatly outperform Risk Terrain Models and Kernel Density Estimation in terms of forecasting future crimes using different measures of predictive accuracy, but only slightly outperform forecasts based on prior counts of crime. We find that the associations between different predictive factors and crime are highly non-linear and vary over space.
Conclusions
We show how black-box machine learning models can provide accurate micro place-based crime predictions, but still be interpreted in a manner that fosters understanding of why a place is predicted to be risky.
Data Availability
Data and code to replicate the results can be downloaded from https://www.dropbox.com/sh/b3n9a6z5xw14rd6/AAAjqnoMVKjzNQnWP9eu7M1ra?dl=0.
Notes
For more discussion about self-exciting point processes, see the full issue of Statistical Science volume 33(3) with several commentaries on Reinhart (2018) as well as a rejoinder responding to these commentaries.
Why a grid cell is estimated to be high risk should not imply that the highlighted covariates are then a causal explanation for crime. The identified land use features correlate with crime and hopefully predict better than chance where future crime will occur, but one should not infer a causal mechanism.
Even these do not exhaust the ways one may measure the impact of a crime generator. Another common approach is to estimate the number of generators nearby, where nearby is either a specific buffer distance, or determined via adjacency of some aggregate unit (Bernasco and Block 2011; Haberman and Ratcliffe 2015; Murray and Roncek 2008; Wheeler 2019b).
Thus a clear difference of predictive models as compared to explanatory models is that the goal is to predict (out-of-sample) crime. That is, the main goal is not to derive causal hypotheses from a causal theoretical model and test these using empirical data. Instead, the goal of predictive models is to predict new or future observations (e.g. point estimates, interval predictions, or rankings).
The most accurate forecasting models are often models that combine different machine learning methods that together predict crime. Using the same logic, our paper is not antagonistic to self-exciting point process models (Mohler et al. 2011). Instead, we see the value of using such methods to predict short-to-medium term crime occurrence, while our models can be used to predict medium-to-long term crime trends and also allow for crime predictions in ‘what-if’ scenarios.
Because these crime generator variables come from various sources, they do not uniformly predate the crime counts used in the research. While the street index database is based on 2014 data, many of the other factors are based on more recent data collections. We do not believe this represents a large threat to the findings though, as many of the factors are historically stable due to long-standing zoning laws in Dallas (Fischel 2015).
Simpson’s Index for a block group equals \(p_{\text{white}}^{2} + p_{\text{black}}^{2} + p_{\text{Hispanic}}^{2}\), where \(p\) represents the proportion of that particular racial group.
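As a worked illustration of the formula above, the following minimal Python sketch sums the squared group proportions for one block group. The function name and the example proportions are hypothetical and are not taken from the study data.

```python
def simpson_index(proportions):
    """Sum of squared group proportions for a single block group."""
    return sum(p ** 2 for p in proportions)


# Hypothetical block group: 50% white, 30% black, 20% Hispanic.
example = simpson_index([0.5, 0.3, 0.2])

# A block group made up of a single group gives the maximum value of 1.
homogeneous = simpson_index([1.0, 0.0, 0.0])

print(round(example, 2), homogeneous)
```

Under this definition, larger values indicate a more homogeneous block group, and smaller values indicate a more even mix of the three groups.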
The fixed L2 costs are one aspect we are not able to replicate given public descriptions. We considered two L2 costs of 1 and 5 in this step. To determine the best L1 penalty we use cross-validation, consistent with the code provided in Kennedy et al. (2016).
We conduct RTM analyses that both exclude and include demographic factors. The models that included demographic factors were much more accurate than those that did not along all of the metrics we evaluate, so we only report the RTM model that includes demographic factors in the model selection process. “Appendix 2” lists the final produced RTM model.
The default settings in the ranger implementation are used, i.e. the number of trees is set to 500, and resampling is done with replacement.
Several of the sawtooth or flat-line patterns in Fig. 1 are due to how we treat prediction ties, most noticeable for the plot in the top right and the plot in the bottom left. Here we present the worst case scenario, sorting the predictions in descending order, and the future crimes in ascending order. This is typically how ROC plots are displayed (Davis and Goadrich 2006), and so it is similarly appropriate for these metrics as well.
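The worst-case ordering described in this note can be sketched in a few lines of Python. This is an illustration under assumptions rather than the replication code: the function names and the toy values are hypothetical.

```python
def worst_case_order(predictions, future_crimes):
    """Order cells by prediction (descending); among tied predictions,
    put cells with the fewest future crimes first (ascending)."""
    pairs = list(zip(predictions, future_crimes))
    return sorted(pairs, key=lambda pc: (-pc[0], pc[1]))


def cumulative_captured(ordered):
    """Running total of future crimes captured as more cells are selected."""
    total = 0
    curve = []
    for _, crimes in ordered:
        total += crimes
        curve.append(total)
    return curve


# Toy data: three cells tie on the prediction, one cell is ranked lower.
preds = [2.0, 2.0, 2.0, 0.5]
crimes = [3, 0, 1, 0]

ordered = worst_case_order(preds, crimes)
curve = cumulative_captured(ordered)
print(ordered)  # [(2.0, 0), (2.0, 1), (2.0, 3), (0.5, 0)]
print(curve)    # [0, 1, 4, 4]
```

Within the tied block the cumulative curve starts flat and rises late, which is the kind of flat or sawtooth segment the note refers to.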
To calculate the expected number of crimes per the kernel density estimate we multiplied the intensity kernel density value by the area of the grid cell, thus getting an expected number of crimes per grid cell.
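The conversion in this note is a single multiplication, sketched below in Python. The 200 by 200 feet cell size comes from the Methods; the assumption that the kernel density intensity is expressed in crimes per square foot, the function name, and the example intensity are all hypothetical.

```python
CELL_SIDE_FEET = 200                  # grid cell side length from the Methods
CELL_AREA_SQ_FT = CELL_SIDE_FEET ** 2  # 40,000 square feet per cell


def expected_crimes(intensity_per_sq_ft, cell_area=CELL_AREA_SQ_FT):
    """Expected crimes in a cell = kernel density intensity x cell area.

    Assumes the intensity is in crimes per square foot (an assumption;
    the units must match the units of the cell area).
    """
    return intensity_per_sq_ft * cell_area


# Hypothetical intensity of 0.000025 crimes per square foot.
value = expected_crimes(2.5e-5)
print(value)
```

If the density surface is stored in other units (for example, crimes per square mile), the cell area would need to be converted to the same units before multiplying.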
Because of the size of the dataset and number of variables, to calculate this we selected a stratified sample of 2000 cases within 10 strata (so overall 20,000 cases). The strata were defined by the crime counts in the cells, with ties broken by the predicted number of crimes.
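A stratified draw of this kind could look like the following Python sketch. It assumes the strata are ten equal-sized groups after ranking cells by crime count and then by predicted crimes; the note does not say how the stratum boundaries were set, so that choice, the function name, and the toy data are all hypothetical.

```python
import random


def stratified_sample(cells, n_strata=10, per_stratum=2000, seed=0):
    """Rank cells by (crime count, predicted crimes), cut the ranking into
    equal-sized strata (an assumption), and sample within each stratum."""
    rng = random.Random(seed)
    ranked = sorted(cells, key=lambda c: (c["crimes"], c["predicted"]))
    size = len(ranked) // n_strata
    sample = []
    for i in range(n_strata):
        start = i * size
        # The last stratum absorbs any remainder.
        end = (i + 1) * size if i < n_strata - 1 else len(ranked)
        stratum = ranked[start:end]
        k = min(per_stratum, len(stratum))
        sample.extend(rng.sample(stratum, k))  # sampling without replacement
    return sample


# Toy data: 1,000 hypothetical cells, most with zero crimes.
toy_rng = random.Random(1)
cells = [
    {"id": i, "crimes": toy_rng.choice([0, 0, 0, 1, 2]), "predicted": toy_rng.random()}
    for i in range(1000)
]
picked = stratified_sample(cells, n_strata=10, per_stratum=20)
print(len(picked))  # 200
```

With the settings in the note (10 strata of 2000 cases each) the same function would return 20,000 cells, provided every stratum holds at least 2000 cells.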
The correct predictive interpretation of a regression coefficient is a comparison between grid cells: how much does crime differ, on average, when comparing two groups of grid cells that differ by 1 in the relevant predictor while being identical in all the other predictors. Only under stringent assumptions can one make a counterfactual interpretation of changes within grid cells: the expected change in crime in that grid cell caused by adding 1 to the relevant predictor, while leaving all the other predictors in the model unchanged (see Gelman and Hill 2006, p. 34). With perhaps a few exceptions, all studies cited in this paper are of the former type.
For example, a location may have a predicted 5 crimes in the area: the distance to the nearest liquor store contributes 3.6 to that prediction, while the proportion in poverty contributes 1.3, and the rest of the predictor variables combine for the additional 0.1 predicted crimes (including factors that potentially decrease crime).
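The additive bookkeeping in this example can be checked with a short Python sketch. It only reproduces the arithmetic of the illustrative numbers above and does not show how the contributions themselves are estimated; the dictionary keys are descriptive labels, not variable names from the study.

```python
# Illustrative contributions from the example above (predicted crimes).
contributions = {
    "distance to nearest liquor store": 3.6,
    "proportion in poverty": 1.3,
    "all other predictors combined": 0.1,
}

# The contributions add up to the predicted number of crimes.
prediction = sum(contributions.values())

# Share of the prediction attributed to each component.
shares = {name: value / prediction for name, value in contributions.items()}

print(round(prediction, 1))
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Because the "all other predictors" term is a net sum, it can hide individual factors with negative contributions that are offset by positive ones.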
We have provided supplementary random forest models to illustrate this confounding in “Appendix 3”. One model uses only the same variables as the RTM model (the binary indicators and the demographic factors), and another uses only the XY coordinates of the grid cell. While the model presented in the paper outperforms either, random forests trained on those different subsets of data still provide excellent future predictions.
References
Andresen MA (2006) Crime measures and the spatial analysis of criminal activity. Br J Criminol 46:258–285
Andresen MA, Jenion GW (2008) Crime prevention and the science of where people are. Crim Justice Policy Rev 19:164–180
Andresen MA, Curman AS, Linning SJ (2017) The trajectories of crime at places: understanding the patterns of disaggregated crime types. J Quant Criminol 33:427–449
Apley DW (2016) Visualizing the effects of predictor variables in black box supervised learning models. arXiv:1612.08468
Barnum JD, Caplan JM, Kennedy LW, Piza EL (2017) The crime kaleidoscope: a cross-jurisdictional analysis of place features and crime in three urban environments. Appl Geogr 79:203–211
Berk R (2008) Forecasting methods in crime and justice. Ann Rev Law Soc Sci 4:219–238
Berk R (2010) What you can and can’t properly do with regression. J Quant Criminol 26:481–487
Berk R (2013) Algorithmic criminology. Secur Inform 2(1):5
Berk R, Bleich J (2013) Statistical procedures for forecasting criminal behavior. Criminol Public Policy 12:513–544
Bernasco W, Block RL (2011) Robberies in Chicago: a block-level analysis of the influence of crime generators, crime attractors, and offender anchor points. J Res Crime Delinq 48:33–57
Bien J, Taylor J, Tibshirani R (2013) A lasso for hierarchical interactions. Ann Stat 41:1111–1141
Block RL, Block CR (1995) Space, place and crime: hot spot areas and hot places of liquor-related crime. Crime Prev Stud 4:145–184
Boessen A, Hipp JR (2018) Parks as crime inhibitors or generators: examining parks and the role of their nearby context. Soc Sci Res 76:186–201
Boivin R, Felson M (2018) Crimes by visitors versus crimes by residents: the influence of visitor inflows. J Quant Criminol 34:465–480
Bowers KJ, Johnson SD, Pease K (2004) Prospective hot-spotting: the future of crime mapping? Br J Criminol 44:641–658
Braga AA, Weisburd DL, Waring EL, Mazerolle LG, Spelman W, Gajewski F (1999) Problem-oriented policing in violent crime places: a randomized controlled experiment. Criminology 37:541–580
Braga AA, Papachristos AV, Hureau DM (2010) The concentration and stability of gun violence at micro places in Boston, 1980–2008. J Quant Criminol 26:33–53
Braga AA, Papachristos AV, Hureau DM (2014) The effects of hot spots policing on crime: an updated systematic review and meta-analysis. Justice Q 31:633–663
Brantingham PL, Brantingham PJ (1993) Nodes, paths and edges: considerations on the complexity of crime and the physical environment. J Environ Psychol 13:3–28
Brayne S (2017) Big data surveillance: the case of policing. Am Sociol Rev 82:977–1008
Breiman L (2000) Randomizing outputs to increase prediction accuracy. Mach Learn 40(3):229–242
Breiman L (2001a) Random forests. Mach Learn 45:5–32
Breiman L (2001b) Statistical modeling: the two cultures. Stat Sci 16:199–231
Brenning A (2012) Spatial cross-validation and bootstrap for the assessment of prediction rules in remote sensing: the R package sperrortest. In: IEEE international geoscience and remote sensing symposium, July
Bushway SD (2013) Is there any logic to using logit? Finding the right tool for the increasingly important job of risk prediction. Criminol Public Policy 12:563–567
Cahill M, Mulligan G (2007) Using geographically weighted regression to explore local crime patterns. Soc Sci Comput Rev 25:174–193
Caplan JM, Kennedy LW (2011) Risk terrain modelling: brokering criminological theory and GIS methods for crime forecasting. Justice Q 28:360–381
Caplan JM, Kennedy LW, Piza EL (2013a) Joint utility of event-dependent and environmental crime analysis techniques for violent crime forecasting. Crime Delinq 59:243–270
Caplan JM, Kennedy LW, Piza EL, with a chapter contribution by Heffner J (2013b) Risk terrain modeling diagnostics utility USER MANUAL. Rutgers Center on Public Security. http://www.rutgerscps.org/uploads/2/7/3/7/27370595/rtmdxusermanual_final_caplankennedypiza.pdf. Accessed 4 Oct 2019
Chainey S, Thompson L, Uhlig S (2008) The utility of hotspot mapping for predicting spatial patterns of crime. Secur J 21:4–28
Chohlas-Wood A, Levine ES (2019) A recommendation engine to aid in identifying crime patterns. Inf J Appl Anal 49:154–166
Clarke RV, Bichler-Robertson G (1998) Place managers, slumlords, and crime in low rent apartment buildings. Secur J 11:11–19
Cohen LE, Felson M (1979) Social change and crime rate trends: a routine activity approach. Am Sociol Rev 44:588–608
Cortright J, Mahmoudi D (2016) City report: the storefront index. City Observatory. http://cityobservatory.org/wp-content/uploads/2016/04/Storefront_Index_April_2016.pdf. Accessed 3 May 2019
Curman AS, Andresen MA, Brantingham PJ (2014) Crime and place: a longitudinal examination of street segment patterns in Vancouver, BC. J Quant Criminol 31:127–147
Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd international conference on machine learning
Deryol R, Wilcox P, Logan M, Wooldredge J (2016) Crime places in context: an illustration of the multilevel nature of hotspot development. J Quant Criminol 32:305–325
Drawve G (2016) A metric comparison of predictive hot spot techniques and RTM. Justice Q 33:369–397
Drawve G, Barnum JD (2015) Place-based risk factors for aggravated assault across police divisions in Little Rock, Arkansas. J Crime Justice 41:173–192
Drawve G, Wooditch A (2019) A research note on the methodological and theoretical considerations for assessing crime forecasting accuracy with the predictive accuracy index. J Crim Justice 64:101625
Drawve G, Moak SC, Berthelot ER (2016a) Predictability of gun crimes: a comparison of hot spot and risk terrain modelling techniques. Police Soc 26:312–331
Drawve G, Thomas SA, Walker JT (2016b) Bringing the physical environment back into neighborhood research: the utility of RTM for developing an aggregate neighborhood risk of crime measure. J Crim Justice 44:21–29
Eck J, Chainey S, Cameron J, Wilson R (2005) Mapping crime: understanding hotspots. National Institute of Justice. Washington, D.C.
Eck JE, Clarke RV, Guerette RT (2007) Risky facilities: crime concentration in homogeneous sets of establishments and facilities. Crime Prev Stud 21:225–264
Efron B, Hastie T (2016) Computer age statistical inference, vol 5. Cambridge University Press, Cambridge
Ferguson AG (2017) The rise of big data policing: surveillance, race, and the future of law enforcement. New York University Press, New York
Fischel WA (2015) Zoning rules!: the economics of land use regulation. Lincoln Institute of Land Policy, Massachusetts
Fotheringham AS, Brunsdon C, Charlton M (2003) Geographically weighted regression: the analysis of spatially varying relationships. Wiley, West Sussex
Friedman JH, Popescu BE (2008) Predictive learning via rule ensembles. Ann Appl Stat 2:916–954
Garnier S, Caplan JM, Kennedy LW (2018) Predicting dynamical crime distribution from environmental and social influences. Front Appl Math Stat 4:13
Gelman A, Hill J (2006) Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, Cambridge
Gelman A, Pardoe I (2007) Average predictive comparisons for models with nonlinearity, interactions, and variance components. Sociol Methodol 37:23–51
Gerell M (2018) Bus stops and violence, are risky places really risky? Eur J Crim Policy Res 24:351–371
Graif C, Sampson RJ (2009) Spatial heterogeneity in the effects of immigration and diversity on neighborhood homicide rates. Homicide Stud 13:242–260
Groff ER (2014) Quantifying the exposure of street segments to drinking places nearby. J Quant Criminol 30:527–548
Groff ER, La Vigne NG (2001) Mapping an opportunity surface of residential burglary. J Res Crime Delinq 38:257–278
Groff ER, La Vigne NG (2002) Forecasting the future of predictive crime mapping. Crime Prev Stud 13:29–57
Groff ER, Lockwood B (2014) Criminogenic facilities and crime across street segments in Philadelphia: uncovering evidence about the spatial extent of facility influence. J Res Crime Delinq 51:277–314
Groff ER, Ratcliffe JH, Haberman CP, Sorg ET, Joyce NM, Taylor RB (2015) Does what police do at hot spots matter? The Philadelphia policing tactics experiment. Criminology 53:23–53
Guerette RT, Bowers KJ (2009) Assessing the extent of crime displacement and diffusion of benefits: a review of situational crime prevention evaluations. Criminology 47:1331–1368
Haberman CP (2017) Overlapping hot spots? Examination of the spatial heterogeneity of hot spots of different crime types. Criminol Public Policy 16:633–660
Haberman CP, Ratcliffe JH (2015) Testing for temporally differentiated relationships among potentially criminogenic places and census block street robbery counts. Criminology 53:457–483
Haberman CP, Groff ER, Taylor RB (2013) The variable impacts of public housing community proximity on nearby street robberies. J Res Crime Delinq 50:163–188
Harcourt BE (2007) Against prediction: profiling, policing, and punishing in an actuarial age. University of Chicago Press, Chicago
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: prediction, inference and data mining. Springer, New York
Hipp JR (2016) General theory of spatial crime patterns. Criminology 54:653–679
Hipp JR, Boessen A (2013) Egohoods as waves washing across the city: a new measure of “neighborhoods”. Criminology 51:287–327
Hipp JR, Kim Y (2017) Measuring crime concentration across cities of varying sizes: complications based on the spatial and temporal scale employed. J Quant Criminol 33:595–632
Hipp JR, Steenbeek W (2016) Types of crime and types of mechanisms: what are the consequences for neighborhoods over time? Crime Delinq 62(9):1203–1234
Hipp JR, Yates DK (2011) Ghettos, thresholds, and crime: does concentrated poverty really have an accelerating increasing effect on crime? Criminology 49:955–990
Hipp JR, Kane K, Kim JH (2017) Recipes for neighborhood development: a machine learning approach toward understanding the impact of mixing in neighborhoods. Landsc Urban Plan 164:1–12
Hipp JR, Bates C, Lichman M, Smyth P (2019) Using social media to measure temporal ambient population: does it help explain local crime rates? Justice Q 36:718–748
Hunt J (2016) Do crime hot spots move? Exploring the effects of the modifiable areal unit problem and modifiable temporal unit problem on crime hot spot stability. Dissertation, American University
Johnson SD (2014) How do offenders choose where to offend? Perspectives from animal foraging. Legal Criminol Psychol 19:193–210
Johnson SD, Summers L, Pease K (2009) Offender as forager? A direct test of the boost account of victimization. J Quant Criminol 25:181–200
Jones RW, Pridemore WA (2019) Toward an integrated multilevel theory of crime at place: routine activities, social disorganization, and the law of crime concentration. J Quant Criminol 35:543–572
Kelly M (2019) The standard errors of persistence. SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3398303. Accessed 4 Oct 2019
Kennedy LW, Caplan JM, Piza E (2010) Risk clusters, hotspots, and spatial intelligence: risk terrain modelling as an algorithm for police resource allocation. J Quant Criminol 27:339–362
Kennedy LW, Caplan JM, Piza E, Buccine-Schraeder H (2016) Vulnerability and exposure to crime: applying risk terrain modeling to the study of assault in Chicago. Appl Spat Anal Policy 9:529–548
Kim YA (2018) Examining the relationship between the structural characteristics of place and crime by imputing census block data in street segments: is the pain worth the gain? J Quant Criminol 34:67–110
Kim YA, Hipp JR (2020) Street egohood: an alternative perspective of measuring neighborhood and spatial patterns of crime. J Quant Criminol 36:29–66
Kochel TR (2011) Constructing hot spots policing: unexamined consequences for disadvantaged populations and for police legitimacy. Crim Justice Policy Rev 22:350–374
Kubrin CE, Hipp JR (2016) Do fringe banks create fringe neighborhoods? Examining the spatial relationship between fringe banking and neighborhood crime rates. Justice Q 33:755–784
Kuhn M, Johnson K (2013) Applied predictive modeling, vol 26. Springer, New York
Kurland J, Johnson SD, Tilley N (2014) Offenses around stadiums: a natural experiment on crime attraction and generation. J Res Crime Delinq 51:5–28
Lee YJ, O SH, Eck JE (2019) A theory-driven algorithm for real-time crime hot spot forecasting. Police Q
Levin A, Rosenfeld R, Deckard M (2017) The law of crime concentration: an application and recommendations for future research. J Quant Criminol 33:635–647
Levine N (2008) The “Hottest” part of a hotspot: commentary on “The utility of hotspot mapping for predicting spatial patterns of crime”. Secur J 21:295–302
Light MT, Harris CT (2012) Race, space, and violence: exploring the spatial dependence in structural covariates of white and black violent crime in US counties. J Quant Criminol 28:559–586
Loftin C (1986) Assaultive violence as a contagious social process. Bull N Y Acad Med 62:550–555
Macbeth E, Ariel B (2019) Place-based statistical versus clinical predictions of crime hot spots and harm locations in Northern Ireland. Justice Q 36:93–126
Malleson N, Andresen MA (2015) The impact of using social media data in crime rate calculations: shifting hot spots and changing spatial patterns. Cartogr Geogr Inf Sci 42:112–121
Massey DS, Denton NA (1988) The dimensions of residential segregation. Soc Forces 67:281–315
McCarty WP, Hepworth DP (2013) Mobile home parks and crime: does proximity matter? J Crime Justice 36:319–333
Miles JN, Weden MW, Lavery D, Escarce JJ, Cagney KA, Shih RA (2016) Constructing a time-invariant measure of the socio-economic status of U.S. census tracts. J Urban Health 93:213–232
Mohler G (2014) Marked point process hotspot maps for homicide and gun crime prediction in Chicago. Int J Forecast 30(3):491–497
Mohler GO, Porter MD (2018) Rotational grid, PAI-maximizing crime forecasts. Stat Anal Data Min 11:227–236
Mohler GO, Short MB, Brantingham PJ, Schoenberg FP, Tita GE (2011) Self-exciting point process modeling of crime. J Am Stat Assoc 106:100–108
Mohler GO, Short MB, Malinowski S, Johnson M, Tita GE, Bertozzi AL, Brantingham PJ (2015) Randomized controlled field trials of predictive policing. J Am Stat Assoc 110:1399–1411
Molnar C (2018) Interpretable machine learning: a guide for making black box models explainable. https://christophm.github.io/interpretable-ml-book/. Accessed 4 Oct 2019
Moreto WD, Piza EL, Caplan JM (2014) “A plague on both your houses?”: risks, repeats and reconsiderations of urban residential burglary. Justice Q 31:1102–1126
Murray RK, Roncek DW (2008) Measuring diffusion of assaults around bars through radius and adjacency techniques. Crim Justice Rev 33:199–220
Ohyama T, Amemiya M (2018) Applying crime prediction techniques to Japan: a comparison between risk terrain modeling and other methods. Eur J Crim Policy Res 24:469–487
Perry WL, McInnis B, Price CC, Smith SC, Hollywood JS (2013) Predictive policing: the role of crime forecasting in law enforcement operations. Rand, Santa Monica
Piza E, Carter JG (2018) Predicting initiator and near repeat events in spatiotemporal crime patterns: an analysis of residential burglary and motor vehicle theft. Justice Q 35:842–870
Piza E, Feng S, Kennedy LW, Caplan JM (2017) Place-based correlates of motor vehicle theft and recovery: measuring spatial influence across neighborhood context. Urban Stud 54:2998–3021
Ratcliffe J (2012) The spatial extent of criminogenic places on the surrounding environment: a change-point regression of violence around bars. Geogr Anal 44:302–320
Ratcliffe JH, McCullagh MJ (2001) Chasing ghosts? Police perception of high crime areas. Br J Criminol 41:330–341
Ratcliffe JH, Rengert GF (2008) Near-repeat patterns in Philadelphia shootings. Secur J 21:58–76
Ratcliffe JH, Taylor RB, Perenzin A (2016) Predictive modeling combining short and long-term crime risk potential, Final report. U.S. Department of Justice, National Institute of Justice, Washington, D.C.
Ratcliffe JH, Taylor RB, Perenzin-Askey A, Thomas K, Grasso J, Bethel K, Fisher R, Koehnlein J (2020) The Philadelphia predictive policing experiment. J Exp Criminol
Reich B, Porter MD (2015) Partially supervised spatiotemporal clustering for burglary crime series identification. J R Stat Soc Ser A 178:465–480
Reinhart A (2016) Response to “Crime places in context”. J Quant Criminol 32:723–724
Reinhart A (2018) A review of self-exciting spatio-temporal point processes and their applications. Stat Sci 33(3):299–318
Reinhart A, Greenhouse J (2018) Self-exciting point processes with spatial covariates: modelling the dynamics of crime. J R Stat Soc Ser C (Appl Stat) 67:1305–1329
Ribeiro MT, Singh S, Guestrin C (2016) Why should I trust you?: Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135–1144
Ridgeway G (2013) The pitfalls of prediction. NIJ J 271:34–40
Ridgeway G (2018) Policing in the era of big data. Ann Rev Criminol 1:401–419
Roberts DR, Bahn V, Ciuti S, Boyce MS, Elith J, Guillera-Arroita G, Lahoz-Monfort JJ, Schroder B, Thuiller W, Warton DI, Wintle BA, Hartig F, Dormann CF (2017) Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 40:913–929
Rosser G, Davies T, Bowers KJ, Johnson SD, Cheng T (2017) Predictive crime mapping: arbitrary grids or street networks? J Quant Criminol 33:569–594
Rummens A, Hardyns W, Pauwels L (2017) The use of predictive analysis in spatiotemporal crime forecasting: building and testing a model in an urban context. Appl Geogr 86:255–261
Sampson RJ (2012) Great American City: Chicago and the enduring neighborhood effect. University of Chicago Press, Chicago
Schmid CF (1926) A study of homicides in Seattle, 1914 to 1924. Soc Forces 4:745–756
Shapley LS (1953) A value for n-person games. Contrib Theory Games 2:307–317
Shaw CR, McKay HD (1969) Juvenile delinquency and urban areas: a study of rates of delinquency in relation to differential characteristics of local communities in American Cities. Revised edition. University of Chicago Press, Chicago
Sherman LW, Gartin PR, Buerger ME (1989) Hot spots of predatory crime: routine activities and the criminology of place. Criminology 27:27–56
Shmueli G (2010) To explain or to predict? Stat Sci 25:289–310
Shmueli G, Koppius OR (2011) Predictive analytics in information systems research. MIS Q 35:553–572
Short MB, D’Orsogna MR, Brantingham PJ, Tita GE (2009) Measuring and modeling repeat and near-repeat burglary effects. J Quant Criminol 25:325–339
Smith WR, Frazee SG, Davison EL (2000) Furthering the integration of routine activity and social disorganization theories: small units of analysis and the study of street robbery as a diffusion process. Criminology 38:489–524
Song G, Bernasco W, Liu L, Xiao L, Zhou S, Liao W (2019) Crime feeds on legal activities: daily mobility flows help to explain thieves’ target location choices. J Quant Criminol 35:831–854
Steenbeek W, Weisburd D (2016) Where the action is in crime? An examination of variability of crime across different spatial units in The Hague, 2001–2009. J Quant Criminol 32(3):449–469
Strobl C, Boulesteix A, Zeileis A, Hothorn T (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinf 8(1):25
Stucky TD, Ottensman JR (2009) Land use and violent crime. Criminology 47:1223–1264
Stults BJ, Hasbrouck M (2015) The effect of commuting on city-level crime rates. J Quant Criminol 31:331–350
Taylor RB, Ratcliffe JH, Perenzin A (2015) Can we predict long-term community crime problems? The estimation of ecological continuity to model risk heterogeneity. J Res Crime Delinq 52:635–657
Uchida CD, Swatt ML (2013) Operation LASER and the effectiveness of hotspot patrol: a panel analysis. Police Q 16:287–304
Van Patten I, McKeldin-Conor J, Cox D (2009) A microspatial analysis of robbery: prospective hot spotting in a small city. Crime Mapp J Res Pract 1:7–32
Vandeviver C, Steenbeek W (2019) The (in)stability of residential burglary patterns on street segments: the case of Antwerp, Belgium 2005–2016. J Quant Criminol 35:111–133
Wachter S, Mittelstadt B, Russell C (2018) Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv J Law Technol 31:841–887
Wang K, Simandl JK, Porter MD, Graettinger AJ, Smith RK (2016) How the choice of safety performance function affects the identification of important crash prediction variables. Accid Anal Prev 88:1–8
Weisburd D (2015) The law of crime concentration and the criminology of place. Criminology 53:133–157
Weisburd DL, Telep CW (2014) Hot spots policing: what we know and what we need to know. J Contemp Crim Justice 30:200–220
Weisburd DL, Bushway SD, Lum C, Yang SM (2004) Trajectories of crime at places: a longitudinal study of street segments in the city of Seattle. Criminology 42:283–322
Weisburd DL, Telep CW, Lawton BA (2014) Could innovations in policing have contributed to the New York City crime drop even in a period of declining police strength?: the case of stop, question and frisk as a hot spots policing strategy. Justice Q 31:129–153
Wheeler AP (2019a) Allocating police resources while limiting racial inequality. Justice Q
Wheeler AP (2019b) Quantifying the local and spatial effects of alcohol outlets on crime. Crime Delinq 65:845–871
Wheeler DC, Waller LA (2009) Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests. J Geogr Syst 11:1–22
Wheeler AP, Worden RE, McLean SJ (2016) Replicating group-based trajectory models of crime at micro places in Albany, NY. J Quant Criminol 32:589–612
Wheeler AP, Worden RE, Silver JR (2019) The accuracy of the violent offender identification directive tool to predict future gun violence. Crim Justice Behav 46:770–788
Wright MN, Ziegler A (2017) ranger: a fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw 77:1–17
Xu J, Griffiths E (2017) Shooting on the street: measuring the spatial influence of physical features on gun violence in a bounded street network. J Quant Criminol 33:237–253
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Commentary expressed within the manuscript does not represent the views of HMS.
Appendices
Appendix 1: Crime Generator Data Source and SIC Code Classifications
This appendix provides additional information on the source of the crime generator data, as well as the SIC codes associated with different business categories. Store Front Index point data for Dallas can be downloaded from https://github.com/dillonma/storefrontindex. Public data downloads are mostly taken from https://gis.dallascityhall.com/shapefileDownload.aspx or http://gis.dallascityhall.com/wwwgis/rest/services. Texas schools are taken from https://schoolsdata2-tea-texas.opendata.arcgis.com/. The only reason check cashing stores are taken from Reference USA as opposed to Lexis Nexis is that my local library cut the service for Lexis Nexis. The storefront data are businesses as of 2015. The rest of the datasets were collected in 2018.
Appendix 2: RTM Predicted Model
See Table 6.
Appendix 3: Additional Random Forest Model Performance Metrics
See Fig. 5.
About this article
Cite this article
Wheeler, A.P., Steenbeek, W. Mapping the Risk Terrain for Crime Using Machine Learning. J Quant Criminol 37, 445–480 (2021). https://doi.org/10.1007/s10940-020-09457-7
DOI: https://doi.org/10.1007/s10940-020-09457-7