For over three decades, hot spots policing (HSP) has been tested extensively using “parallel track” comparisons between two (or more) groups of hot spots over long periods of time (90 to 365 days). The crime totals in hot spots receiving consistent HSP are compared to totals in similar hot spots not receiving HSP (Braga et al., 2019; Sherman & Weisburd, 1995).
In recent years, however, parallel track trials have often been replaced by “repeat crossover” designs for HSP evaluations, especially in the UK. In this design, each hot spot serves as its own control. With each day in each hot spot as the unit of analysis (the hot spot-day), each hot spot is randomly assigned to different treatments on different days. Within each hot spot, average crime outcomes on treatment days are then compared to average outcomes on no-treatment days.
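The within-hot-spot comparison described above can be illustrated with a minimal sketch. The records, the field layout, and the function name are hypothetical and invented purely for illustration; the published studies may well use different outcome measures and statistical models.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hot spot-day records: (hot_spot_id, treated_that_day, crime_count).
# These numbers are invented for illustration and are not data from any study.
records = [
    ("A", True, 1), ("A", False, 3), ("A", True, 0), ("A", False, 2),
    ("B", True, 2), ("B", False, 2), ("B", True, 1), ("B", False, 4),
]

def within_hot_spot_differences(rows):
    """Return {hot_spot_id: mean crimes on treatment days minus mean on no-treatment days}."""
    by_spot = defaultdict(lambda: {True: [], False: []})
    for spot, treated, crimes in rows:
        by_spot[spot][treated].append(crimes)
    return {
        spot: mean(groups[True]) - mean(groups[False])
        for spot, groups in by_spot.items()
        if groups[True] and groups[False]  # both conditions are needed to compare
    }

differences = within_hot_spot_differences(records)
for spot, diff in sorted(differences.items()):
    print(spot, diff)
```

A negative difference means fewer crimes, on average, on treatment days than on no-treatment days in that hot spot. The sketch reports raw differences only and makes no claim about statistical significance.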
The repeat crossover design opens the door for police to practice evidence-based policing by “testing-as-you-go” for continuous impact assessment (CIA). Unlike parallel track designs, repeat crossover designs also have a political advantage of giving every hot spot frequent attention, rather than denying hot spots policing to some locations for months (or a year) at a time.
With appropriate use of a “washout” period, a pause in measurement between treatment crossovers that allows the carryover effects of the last treatment to be “washed away,” police agencies can now adopt “test-as-you-go”: the use of experiments as an ongoing operating strategy. With crossover testing as the means of continuous impact assessment, every hot spot can get extra patrols, just not on every day. By using each hot spot as its own control, the repeat crossover design can gain both statistical power and complete coverage of all crime hot spots. Several such studies have already found significant reductions in crime and violence on the extra patrol days compared to no-patrol days (Barnes et al., 2020; Basford et al., 2021; Bland et al., 2021). Similar designs could also be used for testing particular tactics in hot spots, such as traffic enforcement or stop-search.
The primary purpose of test-as-you-go is not to produce published studies for the accumulation of global knowledge about HSP (e.g., Braga et al, 2019); it is to prevent as much crime as possible, on a continuous basis, with local knowledge about the cumulative and most recent outcomes of the effectiveness of HSP in each specific hot spot. Tracking outcomes this way may lead to modifications in tactics or resources that improve subsequent test results and reduce crime.
The test-as-you-go strategy integrates the “Triple-T” of targeting, testing, and tracking: repeat crossover testing tracks outcomes, which in turn inform new targeting. It uses random assignment by days as a permanent operating model, instead of the “test-once-and-stop” pattern of parallel track designs. Its key challenge is to make ongoing testing as valid as the “test-once-and-stop” designs.
The main issue is the residual or “carryover” effects of the last treatment (Barnes et al., 2020; Koper, 1995; Sherman, 1990). It is therefore essential that the design of any continuous impact assessment (CIA) plan identifies the number of “washout days” needed between treatment categories, so that carryover effects can be “washed out.” Absent a break in measurement for a period of washout days without that HSP treatment, testing-as-you-go risks underestimating the benefits of hot spot patrols, because any lingering effect of patrol days would also lower crime on the no-patrol days used for comparison.
That risk is growing as HSP spreads. Over three decades after it was first tested in a parallel track randomized controlled trial (Sherman & Weisburd, 1995), “Hot Spots Policing” (HSP) may now be the most widely researched and adopted strategy of evidence-based policing (Sherman, 1998, 2013). With over eighty rigorous evaluations showing consistent benefits of HSP, no other policing strategy can offer more independent assessments (Braga et al., 2019; see also Barnes et al., 2020; Basford et al., 2021; Bland et al., 2021; Weisburd et al., 2022). Hailing HSP as the crime reduction strategy with the strongest evidence, one UK policing minister (Malthouse, 2021) has offered special funding for police agencies to implement it. In 2021–2022, there were 18 (of 43) police agencies using such funding, of which 13 established a repeat crossover design for continuous impact assessment (Rose, 2022).
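One way to picture a washout rule is as a filter on which days count toward measurement. The sketch below is an assumption-laden illustration, not a procedure taken from the cited studies: the function name, the condition labels, and the choice to skip days after a crossover in either direction are all hypothetical, and the number of washout days is left as a parameter because the appropriate value has to be established empirically.

```python
def measured_days(assignments, washout_days):
    """Return the indices of days that count toward measurement.

    assignments: daily condition labels for one hot spot, in date order.
    washout_days: days to skip after each change of condition
                  (an assumed parameter; no value is prescribed here).
    """
    keep = []
    skip_remaining = 0
    for day, condition in enumerate(assignments):
        if day > 0 and condition != assignments[day - 1]:
            skip_remaining = washout_days  # a crossover has just occurred
        if skip_remaining > 0:
            skip_remaining -= 1  # this day falls in the washout period
            continue
        keep.append(day)
    return keep

# Hypothetical one-week schedule for a single hot spot.
schedule = ["patrol", "patrol", "none", "none", "none", "patrol", "patrol"]
kept = measured_days(schedule, washout_days=1)
print(kept)
```

With one washout day, the first day after each crossover is dropped from the comparison. A longer washout discards more hot spot-days, so the choice trades measurement validity against statistical power.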
Risks of Disappointment
While hot spots experiments have tested the effects of HSP in targeted hot spots, the precision of that aim can easily be confused with a general reduction in crime across a police force area. The two aims are not the same. One does not require the other to demonstrate effectiveness. If for no other reason, external forces (such as economics or population changes) could be driving crime up across a city, even while HSP is preventing crime from getting even worse city-wide. Yet, it is difficult to prove that HSP “works” across an entire police force when there is no comparison group to that force. This limitation makes HSP vulnerable to its critics.
Even leading criminologists (e.g., Nagin & Sampson, 2019) have argued that local crime reduction does not matter if a city-wide benefit cannot be proven. While at least one quasi-experimental study of an entire city has shown a 7-year city-wide benefit (41% reduction in violence) of “system-level” HSP (Koper et al., 2021), the hot spots strategy is unlikely to have many other city-wide studies of long-term effects any time soon.
The risk of disappointing before/after results with HSP is even greater if the strategy is implemented with low levels of compliance, or is abandoned after officer resistance emerges in the absence of adequate training and supervision (O’Connor, 2022). The main threat is that a force-wide before/after comparison does not differentiate between HSP working in some hot spots but not in others. If judgments about HSP are generalized from force-wide crime trends, disappointment could cripple HSP even before a police organization can begin to develop skill at the new strategy. In sum, the risk comes from putting all your eggs in one basket (the overall, average effect of HSP on all hot spots targeted) rather than considering HSP impact for each hot spot, one at a time.
The Promises of Test-As-You-Go
The risk of disappointment can be reduced by any method that allows hot spots to be examined individually—just as doctors treat patients individually, in light of each one’s individual circumstances. Much of the practice of medicine follows a “test-as-you-go” principle in which doctors first try one treatment (based in part on results of randomized trials), then switch to other treatments if the first choice did not improve the patient’s condition. This trial-and-error strategy is individualized at the patient level, even while it is informed by results from studies involving thousands of patients. Some patients may even comply more with some kinds of treatments (like taking pills) than others (like increasing exercise)—just as police compliance with HSP tasks may also be a major factor in whether that treatment works. By individualizing high-crime places in which crime does not respond to HSP, police leaders do not have to fix the entire strategic system. All they have to do is look at the facts for any one hot spot, and modify the specific tactics at that location.
A further promise of test-as-you-go is to better accommodate the vast spread of crime harm and volume from the highest-ranked to lowest-ranked hot spots. Even if 100 locations out of 10,000 in a city have half of all serious violence, the top-ranked location (#1) could have 20 times as much crime as the bottom-ranked location (#100). By customizing resources and tactics for each hot spot based on its crime frequency and harm, the test-as-you-go method can minimize the over-dosing of lower-level hot spots and under-dosing of the higher-harm hot spots. While a parallel-track design demands consistency of patrol time across inconsistently hot spots (Sherman & Weisburd, 1995), a repeat crossover design allows right-sizing of patrol time relative to each hot spot’s crime intensity.
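As a purely illustrative example of right-sizing, the sketch below splits a fixed patrol budget across hot spots in proportion to their crime counts. Proportional allocation is only one conceivable rule and is an assumption of this example, not a method proposed in the studies cited; the function name, the hot spot labels, and all numbers are hypothetical.

```python
def allocate_patrol_minutes(crime_counts, total_minutes):
    """Split total_minutes across hot spots in proportion to their crime counts.

    Proportional allocation is an assumed rule used only for illustration.
    """
    total_crime = sum(crime_counts.values())
    if total_crime == 0:
        return {spot: 0.0 for spot in crime_counts}
    return {
        spot: total_minutes * count / total_crime
        for spot, count in crime_counts.items()
    }

# Hypothetical counts in which the top-ranked hot spot has 20 times
# as much crime as the bottom-ranked one.
counts = {"rank_1": 200, "rank_50": 40, "rank_100": 10}
minutes = allocate_patrol_minutes(counts, total_minutes=500)
print(minutes)
```

In practice the weights could be based on crime harm rather than volume, and the allocation could be revised as test results accumulate for each hot spot.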
Each Hot Spot Is Its Own Control
Just as each patient is their own “control” for a sequential series of treatments, each hot spot can be its own control for an ongoing comparison of two different treatments. There is a slight difference: doctors could choose to keep using the first treatment that seems to work adequately for each patient, while a police unit can continuously alternate two or more tactics in a single hot spot to see which one works best over time. That difference further strengthens the promise of a test-as-you-go policy.
But how can each hot spot serve as its own comparison? As explained below, test-as-you-go uses “repeat crossover” randomized trials to deploy added patrol to each hot spot so that it can serve as its own control. Simply by randomly assigning each day to a different treatment condition, such as being patrolled for 15 minutes on some days and not on others, the design can reveal the effects of that added patrol in that hot spot.
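The random assignment of days can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the condition labels, the equal probability of each condition, and the 28-day horizon are all hypothetical, and real deployments may use blocked or constrained randomization instead of independent daily draws.

```python
import random

def assign_days(hot_spot_ids, n_days, seed=0):
    """Randomly assign each day in each hot spot to 'patrol' or 'no_patrol'.

    Each hot spot-day is drawn independently with equal probability,
    which is an assumption of this sketch and not a prescribed design.
    """
    rng = random.Random(seed)  # seeded so the schedule can be reproduced and audited
    return {
        spot: [rng.choice(["patrol", "no_patrol"]) for _ in range(n_days)]
        for spot in hot_spot_ids
    }

schedule = assign_days(["A", "B", "C"], n_days=28, seed=42)
for spot, days in schedule.items():
    print(spot, days.count("patrol"), "patrol days of", len(days))
```

Because each hot spot receives both conditions over time, the daily schedule for one hot spot supplies both its treatment days and its own comparison days.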
The “repeat crossover” design has been used in several studies published in the Cambridge Journal of Evidence-Based Policing, starting in 2017 with what was, to my knowledge, the first crossover experiment in hot spots policing ever published (Williams & Coupe, 2017). Since then, Basford et al. (2021) and Bland et al. (2021) have both published successful tests of HSP using the repeat crossover design, both of which were influenced by Barnes et al. (2020). While none of these studies has examined differential effects of HSP on different individual hot spots, all of them used a research design that has the potential to do so.
Understanding that potential requires a re-examination of the two principal designs in field experiments: parallel track vs. repeat crossover. While statisticians have shown for almost a century that these are far from the only designs possible (Cox, 1958; Fisher, 1935), they have been the most frequently deployed strategies in evidence-based policy development in medicine, education, and other operational fields. Not every research question offers a choice between the two designs, as noted below. Hot spots policing, however, can clearly be tested with either design—as long as each experiment maintains internal validity.