Introduction

Recent events involving law enforcement, including the killing of George Floyd by police officers and the shooting of Black motorist Daunte Wright by a 26-year veteran officer, have drastically altered the relationships between community members and law enforcement. With record numbers of citizens condemning police brutality and racism, protests across the nation calling for ‘defunding’ police, and anti-police sentiment that sometimes manifests in attacks on police, officers around the country may feel besieged (Hutchinson, 2020), complicating efforts for police to connect with communities (Montgomery, 2020). In addition, the COVID-19 pandemic has intensified long-standing racism, attendant disenfranchisement of African-American citizens, and related economic and social inequities (Galea & Abdalla, 2020). These experiences underscore the need for improved police-community relationships and the increasing importance of community-based policing approaches. There is also a need to revisit previously successful policing practices and assess their effectiveness in this new climate.

Problem-oriented policing (POP) strategies are often encompassed within a larger community- and place-based approach to policing (Goldstein, 1990). A considerable evidence base shows that interventions grounded in POP tend to produce larger reductions in crime at high-crime locations than those based solely on traditional patrol and enforcement measures (Braga et al., 2012, 2015; Braga & Bond, 2008; Hinkle et al., 2020; Taylor et al., 2011). However, while Goldstein’s original definition of POP includes an emphasis on community involvement (Goldstein, 1979), POP efforts in practice often fall short of this ideal: they receive limited community involvement and agency support, and they rely heavily on enforcement tactics (Cooley et al., 2018; Cordner & Biebel, 2005; Eck, 2006; Groff et al., 2015; Ratcliffe et al., 2015; Sherman et al., 1989; Weisburd & Braga, 2006).

The POP effort evaluated in this study was intended to integrate community engagement into the POP process (Weisburd & Braga, 2013) in two mid-sized East Coast cities in the USA. The study represents a fairly large randomized controlled trial (RCT) of POP in two cities. We identified 60 crime hot spots in one city and 42 in the other, and randomly assigned these hot spots to receive a community-infused version of POP or standard patrol services. Importantly, our RCT is one of the first to take place during the anti-police and social justice protest movements that erupted nationally in the summer of 2020, following the killing of George Floyd by police, and during a global pandemic. The effectiveness of POP under such circumstances is unknown. Our study provides some insights into whether POP had beneficial or harmful effects on crime in this unique context.

Hot spot policing

One of the most important criminological findings of the past decades is that crime concentrates in micro-places, also called hot spots (Weisburd, 2015). Hot spots are specific addresses, intersections, street blocks, or clusters of street blocks that often have features or facilities that create criminal opportunities and facilitate offending (Eck & Weisburd, 1995). Research suggests that hot spots accounting for 5% or less of a city’s street blocks produce about half of its crime, and this pattern tends to be stable over time (Pierce et al., 1988; Sherman et al., 1989; Weisburd et al., 2004). Decades of research have shown that place-based policing strategies, or hot spot policing targeting locales where crime concentrates, are effective at reducing crime (Braga & Weisburd, 2022; Braga et al., 2012, 2014; Campbell Collaboration, 2018). In a review of nine experimental and quasi-experimental studies of hot spot policing interventions, Braga (2007) found that these efforts reduced crime or disorder in seven of the nine cases. Similar results were seen in an updated review, which identified 62 eligible hot spot policing intervention evaluations, 35 of them RCTs (Braga et al., 2019). A majority of these evaluations concluded that hot spot policing programs generated significant crime control benefits in the treatment areas relative to the control areas (Braga et al., 2019). However, more work is still needed to assess which hot spot policing approaches are most effective at reducing crime.

Extant research on hot spot policing has explored strategies such as directed patrol (Sherman & Weisburd, 1995); order maintenance and drug enforcement crackdowns (Braga & Bond, 2008); raids on crack houses (Sherman et al., 1995); and other forms of problem-solving that have included situational crime prevention, nuisance abatement, cleanup activities, improvement of social services, and other measures (Braga & Bond, 2008; Eck & Wartell, 1998; Mazerolle et al., 2000; Taylor et al., 2011). As noted by the National Academies of Sciences in the early 2000s (National Research Council, 2004), again in 2018 (National Academies of Sciences & Medicine, 2018), and still true to this day, we do not have sufficient evidence to determine which types of police approaches are optimal at hot spots, but some research suggests that crime reductions may be more likely when these interventions follow POP principles (Braga et al., 2014).

POP and hot spots

The POP model calls for police to move beyond reactive incident-driven policing by focusing on underlying problems that contribute to crime and disorder in the community, and taking proactive, preventive action against the causes of crime (Goldstein, 1979, 1990). Ten RCTs and nine quasi-experiments of POP (Braga & Weisburd, 2022; Braga et al., 2012) have found that interventions grounded in POP tend to produce larger reductions in crime than those based solely on traditional patrol and enforcement measures (Braga & Bond, 2008; Braga et al., 2012, 2015; Hinkle et al., 2020). POP may be particularly effective in the context of hot spots (Weisburd et al., 2010), insofar as focusing attention on these very specific locations can help officers to identify tangible conditions that contribute to crime and disorder at these places and to develop responses tailored to the specifics of these places and their problems. Besides targeted enforcement, problem-solving efforts at hot spots have included situational crime prevention, nuisance abatement, clean-up projects, and social services (Braga & Bond, 2008; Braga et al., 1999; Eck, 2002; Eck & Wartell, 1998; Mazerolle et al., 2000; Sherman et al., 1989; Taylor et al., 2011).

POP experts have also built systems to support a structured approach to POP through the Scanning, Analysis, Response, and Assessment (SARA) model (Eck & Spelman, 1987). SARA is a systematic and analytical approach to problem-solving that starts with the police continually scanning their areas of responsibility, drawing on a variety of sources of information, in particular community residents, to identify key community problems (Cordner & Biebel, 2005). Next, police carefully analyze those problems to verify, describe, and explain them (Cordner & Biebel, 2005). Only after analysis should police move to responses, and when they do, they should identify and consider a wide range of responses before narrowing their focus down to the most promising alternatives (Cordner & Biebel, 2005). After implementing a response, the police should then assess the intervention’s impact to determine whether they need to try something else and to document lessons learned for the benefit of future problem-solving efforts (Cordner & Biebel, 2005).

While the SARA model is relatively straightforward, in practice it can be challenging for law enforcement agencies to consistently implement all of these steps (Cordner & Biebel, 2005). Critics have noted that in practice POP efforts often involve limited analysis and heavy reliance on enforcement tactics and other relatively easy situational crime prevention responses—what some have called “shallow” problem-solving (Braga & Bond, 2008; Cordner & Biebel, 2005; Eck, 2006). Some have also questioned whether officers have the training and skills necessary to implement the analysis and other elements of SARA (Cordner & Biebel, 2005), an element addressed in this study with a training program for officers in an adapted version of the SARA model. As noted by Groff and colleagues, these types of policing interventions can be difficult to implement consistently and measure effectiveness with an RCT framework (Groff et al., 2015), raising questions about the feasibility of maintaining an RCT of this type of intervention for a year across over 100 hot spots with patrol officers, as done in this paper.

Linking community policing and POP to address crime in hot spots

Goldstein’s original definition of POP included citizen input and involvement as key features (Goldstein, 1979). Also, an important element of SARA-guided POP initiatives is active participation from community residents (Cordner & Biebel, 2005). On the community-oriented policing (COP) side, the federal COPS Office’s original definition of COP includes problem-solving (Office of Community Oriented Policing Services, 2021). In practice, however, the implementation of both POP and COP has fallen short of these ideals, and the two have become differentiated in practice (Gill et al., 2014). Whereas the focus of POP is to find effective solutions to problems, it often does not meaningfully involve the community nor place emphasis on building positive community relationships (Gill et al., 2014). Also, community policing alone often does not include thorough problem analysis approaches as in the SARA model (Gill et al., 2014). In other words, a combined POP and COP approach is still rare (Weisburd et al., 2008). The few studies evaluating a combined POP and COP approach (Telep & Weisburd, 2012; Tuffin et al., 2006; Weisburd et al., 2008) often used less rigorous methods with one exception of an RCT focusing on youth (Weisburd et al., 2008).

The involvement of community members in problem-solving is particularly important in the post-Floyd era in which greater friction between communities and the police has been reported broadly (Ang et al., 2021; Buchanan et al., 2020; Kochel, 2019). This is likely a particular concern in African American communities that have experienced structural discrimination and marginalization for decades which can manifest in mistrust of police intentions and activities (Bylander, 2015; Radin, 2015).

The current study

We identified several gaps in the literature. First, rigorous RCTs of POP in crime hot spots are still not common (Braga & Weisburd, 2022; Braga et al., 2012), and large RCTs are even less common (Braga et al., 2019). Additionally, POP interventions tend to be under six months in duration (Braga et al., 2019). Second, while the SARA model is relatively straightforward, in practice it can be challenging for law enforcement agencies to consistently implement all of its steps (Cordner & Biebel, 2005). Third, POP efforts tend to involve little community involvement, departing from Goldstein’s original design of POP (Goldstein, 1979). Therefore, the field is still in critical need of large-scale rigorous RCTs evaluating longer-term POP efforts that emphasize community participation, especially in the post-Floyd era.

The experiment for this paper was conducted in two geographically proximate mid-sized cities, located within the same state on the East Coast of the USA (South Atlantic). One RCT contained 42 hot spots (site A) and the other contained 60 hot spots (site B). In each city, the identified hot spots were randomly assigned to receive a community-infused problem-oriented policing (CPOP) intervention or standard patrol services. All analyses were run separately for each city, providing a replication test for the efficacy of the CPOP intervention. The heaviest implementation of the intervention overlapped significantly with the COVID-19 pandemic and the Floyd-related protests of 2020 in site A. In contrast, the heaviest implementation of the intervention in site B was mostly over by the time of these events. Thus, our results provide insights into whether POP had beneficial, harmful, or neutral effects on crime reduction in today’s context. The purpose of these two RCTs is to evaluate the efficacy of a hot spot strategy that involved regular hot spot patrol supplemented by problem-solving work and community engagement. Our emphasis on community engagement was also motivated by concerns about the need for better police-community relations and engagement that have been building since the 2014 Ferguson case, in which Michael Brown, an African-American man, was fatally shot by a White police officer (Potterf & Pohl, 2018), and since President Obama’s Task Force on 21st Century Policing (President’s Task Force on 21st Century Policing, 2015). The Floyd killing further intensified those concerns.

In sum, we examine whether POP can be executed successfully (1) for a long-term period (one year) to assess long-term effects and sustainability; (2) across large numbers of hot spots to provide a more rigorous test and to assess the practicality of implementation over a larger number of places; and (3) by patrol officers within the context of their everyday patrol responsibilities.

Methods

Research sites

The intervention was conducted in two research sites in mid-Atlantic states with mid-level crime rates, referred to as site A and site B. Police departments in the two jurisdictions both serve medium-size cities with similar agency and community characteristics. Site A has a strong background in implementing POP strategies while site B has a strong background in implementing community policing strategies; each site could therefore draw on the other’s established strengths in building the community-infused POP intervention.

Crime hot spots

In collaboration with the participating agencies, a total of 102 crime hot spots were identified across both research sites, including 42 hot spots in site A and 60 hot spots in site B. Crime hot spots were selected by geo-coding areas with the highest numbers of UCR part I violent crimes between 2015 and 2017, using ArcGIS software. The hot spots averaged 0.2–0.3 square miles in site A and 0.1 square miles in site B. The average buffer between hot spots was 911 feet (min. 165, max. 4,598, median 415) in site A and 606 feet (min. 56, max. 2,317, median 412) in site B.

Randomization was completed using a block randomization design within each site based on violent crime counts to make the treatment and control groups as comparable as possible. Four levels of crime blocks were created based on UCR part I violent crime counts: low, medium, high, and very high. The distribution of violent crime counts for the blocks was determined separately for each site to create an even number of hot spots for randomization and a similar number of hot spots within each block across sites for comparability (Weisburd & Gill, 2014). Crime blocks for site A were defined as less than 25 violent crimes in the low block (n = 16 hot spots); 26–49 violent crimes in the medium block (n = 14); 50–69 violent crimes in the high block (n = 6); and 70 or greater violent crimes in the very high block (n = 6). Crime blocks for site B were defined as less than 10 violent crimes in the low block (n = 24 hot spots); 11–29 violent crimes in the medium block (n = 20); 30–59 violent crimes in the high block (n = 10); and 60 or greater violent crimes in the very high block (n = 6). Hot spots were then randomly assigned to either the community-infused POP (CPOP) treatment or the control condition of standard patrol (“business as usual”) within each crime block (within each site) using SPSS-generated random numbers: 21 hot spots were assigned to receive the intervention and 21 hot spots were assigned to control in site A; 30 hot spots were assigned to each condition in site B. After randomization, treatment and control groups were compared within each site to check their comparability. There were no significant differences between the treatment and control groups on total crime count, area in square miles, or percent business within each hot spot (Table 1).
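The blocked assignment described above can be illustrated with a minimal sketch. It is written in Python rather than SPSS, so it is a stand-in for the procedure and not the authors' code. The function name `block_randomize`, the hot spot identifiers, and the seed are hypothetical; only the site A block sizes (16, 14, 6, 6) come from the text.

```python
import random


def block_randomize(hot_spots, seed=0):
    """Assign half of the hot spots in each crime block to treatment.

    hot_spots: list of (hot_spot_id, block_label) tuples.
    Returns a dict mapping hot_spot_id -> "CPOP" or "control".
    """
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible

    # Group hot spot identifiers by their crime block.
    by_block = {}
    for spot_id, block in hot_spots:
        by_block.setdefault(block, []).append(spot_id)

    # Shuffle within each block and split the block in half.
    assignment = {}
    for ids in by_block.values():
        shuffled = ids[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for spot_id in shuffled[:half]:
            assignment[spot_id] = "CPOP"
        for spot_id in shuffled[half:]:
            assignment[spot_id] = "control"
    return assignment


# Site A block sizes as reported in the text; identifiers are made up.
SITE_A_BLOCKS = {"low": 16, "medium": 14, "high": 6, "very high": 6}
site_a = [(f"A-{block}-{i}", block)
          for block, n in SITE_A_BLOCKS.items() for i in range(n)]

site_a_assignment = block_randomize(site_a, seed=2019)
n_treated = sum(1 for arm in site_a_assignment.values() if arm == "CPOP")
```

Because every block contains an even number of hot spots, splitting each block in half yields equal arm sizes overall (21 and 21 for the site A block sizes), which is the property the blocking is meant to guarantee.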

Table 1 Sample description by site and hot spot treatment assignment

Description of the intervention

The intervention period spanned 16 months in site A (May 2019–August 2020) and 17 months in site B (March 2019–July 2020). Prior to the intervention, the research team conducted a half-day training focusing on POP and community policing strategies for officers leading the intervention efforts in both police agencies. The training covered POP and targeting the root causes of crime (Clarke & Eck, 2005; Eck & Clarke, 2003), strategies to engage community members in problem-solving, and approaches to recording intervention activities. Both agencies worked with the research team to develop recording systems for officers to log their hot spot visits, daily hot spot activities, and problem-solving efforts on the mobile computers installed in police cars. During the training, the officers were trained in the use of this simple recording system.

Officers were instructed to conduct standard policing efforts involving traditional patrol operations in the control hot spots, without the introduction of the elements of POP. In the CPOP hot spots, a team of patrol officers was assigned to a specific group of CPOP hot spots that fell into their regular patrol/beat area. Typically, between one to three officers led the effort for the hot spots in their patrol/beat area assigned to receive the intervention. Each city had about 15 teams of officers, with some teams covering more than one hot spot. They worked on these hot spots on a part-time basis while also continuing with their normal patrol responsibilities (i.e., responding to calls for service). Officers were only told about the location of the treatment hot spots and were not aware of the location of the control hot spots. They were instructed to conduct standard patrol work in all non-treatment areas, including the control hot spots.

Teams were instructed to implement POP to identify and analyze specific crime and disorder problems in each hot spot and develop response strategies in conjunction with ongoing assessment for efficacy. Although patrol teams were encouraged to directly engage with community members, in reality, most projects implemented focused on traditional POP strategies. Examples of recorded activities included direct patrol, investigating particular individuals or addresses, conducting traffic stops and field interviews, talking with residents, business owners, and property managers about problems and efforts to address them, providing community members crime prevention tips, and conducting foot or bike patrols.

However, these activities and other forms of community engagement were limited by the COVID-19 pandemic, which overlapped with the months of heaviest project activity for site A (April 2020–August 2020) but less so for site B. COVID-19-related restrictions and changes to social routines reduced vehicle traffic as well as public activity. The site A and site B agencies also advised officers to minimize personal interactions with members of the public to the extent possible during much of this period. From April 2020 through August 2020, officers reported informal engagement with community members about problems on 3–8 occasions per hot spot per month on average. Other forms of proactive work and engagement were reported at similar or lower levels.

For the POP projects, depending on the key causes of crime identified, patrol teams focused on the offenders, the victims, or the larger community environment to implement a range of responses. For example, officers in site A identified the need for greater traffic enforcement to reduce speeding and traffic collisions in a hot spot and to identify a known repeat offender. Another POP project conducted in site A involved engaging the property manager of a residential building to address trespassing issues and distribution of illegal narcotics. In that project, officers proactively reviewed information on related arrests in the area to target “problem” tenants and conduct search warrants, leading to a decrease in calls for service in the area. An example POP project in site B was conducted after a shots fired call was received in a hot spot and larcenies were identified as a recurring issue. Officers reviewed police data to determine the timing of incidents. As a response, they increased patrol during that time, and enhanced community policing in that area by providing crime prevention tips to residents and property management.

We detected some variation in treatment dosage (i.e., time spent in hot spots) by both hot spot and time. The officers were not provided pre-intervention crime data about the hot spots. Therefore, it is unlikely that they intentionally provided less treatment to hot spots that were somewhat lower in crime. The expectation was that all treatment hot spots should receive somewhat similar amounts of treatment, unless the POP work indicated more intervention was necessary. In total, officers in site A spent an average of 409.5 min across an average of ten patrol visits per treatment hot spot per month, or about 40 min per visit; for officers in site B, the corresponding figure was 200.26 min across an average of ten patrol visits per hot spot per month, or only about 20 min per visit.

Patrols in site B were most frequent during the first three months of the project, when officers conducted an average of 18 to 22 patrols per hot spot per month and accumulated as many as 571 min per hot spot per month (over 12,000 min across all treatment hot spots for site B). The patrols began declining in the mid-summer of 2019 and continued declining through late 2019 and early 2020, due to agency-wide staffing reductions (which resulted in both the loss of project officers and delays in refilling their positions). By February and March of 2020, as the COVID-19 pandemic emerged, visits to hot spots averaged three to five per hot spot per month.

In contrast, in site A, command staff changes, staffing shortages, and other new initiatives delayed intensive and sustained project implementation, particularly in two of the agency’s precincts. As a result, recorded patrols averaged about 167 min across six to seven visits per hot spot per month for most of the first year (with total minutes across all locations often less than 5000 per month). The agency made a concerted push on the project in the spring of 2020, increasing project activity considerably in April 2020 and maintaining generally higher levels, with some fluctuations, through August 2020. During the peak months of the spring and summer of 2020, hot spot patrols averaged 843–1219 min across 21–25 visits per hot spot per month (total patrol minutes delivered to the hot spots ranged at maximum from approximately 17,700 to 25,600).

In addition to documenting the time spent in hot spots, officers also documented POP activities. Officers in site A recorded SARA activities including “scanning” (4.1 times per month per hot spot), “responding” (4.5 times per month per hot spot), “assessment” (2.6 times per month per hot spot), and “proactive” (11.3 times per month per hot spot). Officers in site B recorded more traditional proactive policing activities including citizen contacts (8.49 times per month per hot spot), felony arrests (0.03 per month per hot spot), misdemeanor arrests (0.07 per month per hot spot), criminal summonses (0.10 per month per hot spot), traffic summonses (0.53 per month per hot spot), foot/bike patrols (0.64 times per month per hot spot), “knock and talks” (0.24 per month per hot spot), community meetings (0.06 per month per hot spot), and field interviews (1.16 per month per hot spot). As different types of POP activities were recorded for the two sites, we chose to use time spent in hot spots when distinguishing hot spots that received high or low dosages.

It is important to note that for site A, as mentioned above, the second half of the intervention overlapped heavily with the COVID-19 pandemic. The onset of the COVID-19 pandemic in March 2020 quickly led to near ubiquitous closures of businesses, schools, and government buildings after stay-at-home orders from state and local governments went into effect. As a result, police agencies limited their contact with the public to the extent possible and many proactive activities emphasized by the intervention were no longer possible. Widespread sheltering in place also disrupted routine activities and significantly altered crime patterns. The widespread effect of COVID-19 on police activities, crime patterns, and everyday life might have also muted the program’s effect on crime in both places.

Concurrent to pandemic-related impacts were the nationwide (including in the two project sites) protests against police brutality in the wake of George Floyd’s death in May 2020 (Bliss, 2020; Buchanan et al., 2020). These protests accelerated the trend of anti-police sentiment on the rise since the early 2010s after several high-profile police use-of-force cases (Kochel, 2019). Increased tension between the community and police may have hindered the ability of officers to improve relations among residents in their jurisdictions. Attitudes and cooperation with police are at historically low levels (Ortiz, 2020; The Associated Press-NORC Center for Public Affairs Research, 2015). For these reasons, it has become increasingly imperative to find ways for officers to effectively rebuild these relationships.

Measures

To measure crimes in the hot spots, we collected uniform crime report (UCR) data in crime hot spots from both agencies. We also tracked monthly intervention data by collecting paper-based or online forms that officers working in each hot spot filled out during the intervention period.

Monthly uniform crime reporting data

Monthly uniform crime reporting data for each hot spot during the period before, during, and after the intervention were provided by the two police agencies (no missing data) and used to measure “part I” property and violent crimes. For each hot spot, property crime was coded as the total count of all Part I property crimes (burglary, arson, shoplifting, theft) per month. Violent crime was coded as the bi-monthly total count of Part I violent crimes (homicide, aggravated assault, sexual assault, robbery) in order to accommodate the low violent crime counts in each month. On average, hot spots in site A experienced 5.19 property crimes and 1.17 violent crimes per month, and those in site B experienced 2.18 property crimes and 0.24 violent crimes per month.
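The aggregation of monthly violent crime counts can be sketched as follows. This is only an illustration, assuming that "bi-monthly" here means consecutive two-month periods; the function name `to_bimonthly` and the example counts are hypothetical and are not study data.

```python
def to_bimonthly(monthly_counts):
    """Sum consecutive pairs of monthly counts into two-month totals.

    If the series has an odd number of months, the last month is kept
    as its own (partial) period.
    """
    return [sum(monthly_counts[i:i + 2])
            for i in range(0, len(monthly_counts), 2)]


# Made-up monthly violent crime counts for one hot spot.
example_monthly_violent = [1, 0, 2, 1, 0, 0]
example_bimonthly = to_bimonthly(example_monthly_violent)
```

Pooling adjacent months reduces the number of zero-count observations, which is the stated motivation for analyzing violent crime at a coarser time unit than property crime.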

Hot spot characteristics

Hot spot characteristics were included in the analyses as covariates, including crime level (low, medium, high, very high) used as blocking factors, area square miles of the hot spot, and percentage of business addresses in the hot spot. We also included a categorical variable to indicate whether the period is before, during or after the intervention.

Intervention dosage

Time spent in each hot spot across all POP and community policing activities was recorded by month. To account for the variation in intervention dosage by hot spot, we further coded hot spots into either low treatment or high treatment based on dosage. The low/high treatment classifications were calculated separately by site in the following steps. This coding method (rather than a simpler method of cutting at the median for the total minutes spent across the entire intervention period) allows us to account for the variability of time spent in each hot spot across the intervention months. Step 1: For a given month, a hot spot was considered to have received a high treatment level if it received greater than 410 min (the average amount of time spent between May 2019 and September 2020) of project-related activities in site A or greater than 201 min (the average amount of time spent between March 2019 and April 2020) in site B. Step 2: The percent of months in which a hot spot received a high dosage was then calculated. For example, if a hot spot in site A exceeded 410 min during five months out of the total intervention period of 16 months, the percent of months in which that hot spot received a high dosage would be 5/16, or 31%. Step 3: A median for the percent of months of receiving high-dosage treatment was then calculated across all hot spots in each site. Step 4: A hot spot was considered to be in the high treatment group for its city if the percent of months in which it received a high dosage was above this median percentage.
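The four classification steps can be expressed compactly in code. The sketch below follows the steps as described, but the function name `classify_dosage`, the hot spot labels, and the minutes are hypothetical; only the 410-minute site A threshold is taken from the text.

```python
from statistics import median


def classify_dosage(minutes_by_spot, monthly_threshold):
    """Classify hot spots as "high" or "low" treatment within one site.

    minutes_by_spot: dict mapping hot_spot_id -> list of minutes recorded
        in each intervention month.
    monthly_threshold: minutes a month must exceed to count as high dosage.
    """
    # Steps 1-2: share of months in which each hot spot exceeded the threshold.
    share_high = {
        spot: sum(1 for m in months if m > monthly_threshold) / len(months)
        for spot, months in minutes_by_spot.items()
    }
    # Step 3: median of those shares across the site's hot spots.
    cutoff = median(share_high.values())
    # Step 4: "high" only if the share is strictly above the median.
    return {spot: ("high" if share > cutoff else "low")
            for spot, share in share_high.items()}


# Made-up minutes for three hot spots over four months.
example_minutes = {
    "hs1": [500, 450, 100, 420],  # 3 of 4 months above 410
    "hs2": [100, 200, 300, 415],  # 1 of 4 months above 410
    "hs3": [0, 50, 80, 90],       # 0 of 4 months above 410
}
example_groups = classify_dosage(example_minutes, monthly_threshold=410)
```

Note that a hot spot sitting exactly at the median share falls in the low group under this reading of "above this median percentage", so the two groups need not be equal in size.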

Seasons

We added variables, Spring 2018 through Winter 2020, to control for seasonal effects (McDowall et al., 2012) and account for COVID-19 pandemic-related effects on police operations (Abrams, 2021). Given the length of the intervention period of over 16 months, it is important to consider whether there were seasonal effects for the intervention. For property crimes, we included a categorical variable for season, defined by calendar quarter: January to March of each year were coded as “Winter,” April to June as “Spring,” July to September as “Summer,” and October to December as “Fall.” For violent crimes, since the analysis unit was bi-monthly, we combined summer and fall to create one category and winter and spring to create another category.
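The season coding above amounts to a simple mapping from calendar month to category. A minimal sketch follows; the function names and the labels for the two combined categories are hypothetical.

```python
def season_for_month(month):
    """Map a calendar month (1-12) to the season label used for property crime."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # Jan-Mar -> Winter, Apr-Jun -> Spring, Jul-Sep -> Summer, Oct-Dec -> Fall
    return ("Winter", "Spring", "Summer", "Fall")[(month - 1) // 3]


def half_year_for_month(month):
    """Map a calendar month to the two-category coding used for violent crime."""
    if season_for_month(month) in ("Summer", "Fall"):
        return "Summer/Fall"
    return "Winter/Spring"


example_seasons = [season_for_month(m) for m in (1, 4, 9, 12)]
```

Under this mapping, each two-month violent crime period that starts in an odd-numbered month falls entirely within one of the two combined categories.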

Analytic plan

Separate analyses for the two sites were conducted with R software (version 4.0.2). Our analyses examine changes in crime pre-intervention (January 2018–April 2019 for site A and January 2018–February 2019 for site B), during-intervention (May 2019–August 2020 for site A and March 2019–July 2020 for site B), and post-intervention (September 2020–December 2020 for site A and August 2020–December 2020 for site B). For an exploratory crime trend analysis, we compared the means for each period separately, assuming a Poisson distribution for the counts and accounting for the spatial and temporal correlation between data points coming from the same hot spot and/or time period. We then fit multivariate hierarchical Poisson regression models to the counts of property and violent crimes in each hot spot and accounted for both temporal and spatial correlation when specifying the variance components. We also tested for over-dispersion in the models and concluded that Poisson models fit the data well.
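The text does not say which over-dispersion test was used. One common diagnostic, shown here purely as an assumption, is the Pearson dispersion statistic: the Pearson chi-square divided by the residual degrees of freedom, with values near 1 consistent with a Poisson model. The function name and the example data are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom.

    observed: observed counts.
    fitted: model-fitted Poisson means (all must be positive).
    n_params: number of estimated model parameters.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / dof


# Made-up counts with an intercept-only fit (every fitted mean = sample mean).
example_counts = [2, 0, 3, 1, 4]
example_fitted = [2.0] * 5
example_ratio = pearson_dispersion(example_counts, example_fitted, n_params=1)
```

A ratio well above 1 would point toward a negative binomial or quasi-Poisson specification instead; the conclusion reported in the text corresponds to a ratio close to 1.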

For both property and violent crimes in each site, two sets of models were fit. In model 1, we study the effect of a two-level treatment (treatment vs control). In model 2, a three-level treatment considering dosage (high treatment vs low treatment vs control) was used to represent the study treatment effects. To examine the intervention effect in the during- and post-intervention periods, and to control for any differences between the treatment and control hot spots during the pre-intervention period, we included interaction terms between treatment status (treatment/control in model 1 and high treatment/low treatment/control in model 2) and the intervention period (pre-, during-, and post-intervention). The directionality and magnitude of the intervention effect were evaluated by time period. Specifically, we tested each contrast of treatment (high or low separately in the models with three-level treatment) vs control, separately in the pre-, during-, and post-intervention periods. We used Sidak’s correction to control the familywise error rate (Holland & Copenhaver, 1988). In all models, we controlled for hot spot size (normalized square miles) and the percentage of a hot spot’s addresses that were businesses, and accounted for the temporal correlation within hot spots and possible spatial correlation between hot spots. We also controlled for seasonal variation and the hot spot’s crime block (low, medium, high, and very high).
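For readers unfamiliar with Sidak’s correction, the adjustment for m independent tests can be sketched as follows. The choice of m = 3 in the example (one contrast per period) is an assumption made for illustration; the text does not state the size of the test family, and the function names are hypothetical.

```python
def sidak_alpha(alpha, m):
    """Per-test significance level that keeps the familywise error rate at alpha."""
    return 1 - (1 - alpha) ** (1 / m)


def sidak_adjust(p_values):
    """Sidak-adjusted p-values for a family of tests."""
    m = len(p_values)
    return [1 - (1 - p) ** m for p in p_values]


# Assumed family of three contrasts (pre-, during-, post-intervention).
per_test_alpha = sidak_alpha(0.05, 3)

# Made-up p-values, one per contrast.
adjusted = sidak_adjust([0.01, 0.04, 0.20])
```

With three contrasts, the per-test threshold drops from 0.05 to roughly 0.017, so an unadjusted p-value of 0.04 would no longer count as significant.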

Results

Tables 1 and 2 summarize the characteristics of the hot spots in each site by treatment status. As presented in Table 1, we observe similar distributions of crime counts, area in square miles, percentage of business addresses, and population size for the treated and control hot spots overall in both site A and site B. By design, the distribution of the crime blocks and the number of hot spots are the same in the control and treatment groups in both cities. As presented in Table 2, however, violent crime counts in the low-treatment hot spots were significantly lower than those in both the high-treatment and control hot spots in site A prior to the intervention. In addition, pre-intervention property crime counts in the high-treatment hot spots were significantly higher than those in the control hot spots in site B. These patterns suggest that officers working the treatment locations in both cities tended to focus more of their efforts on locations that began with higher crime.

Table 2 Sample description by site and three-level hot spot treatment

Tables 3 and 4 present property and violent crime counts by treatment status, overall and separately for the pre-, during-, and post-intervention periods in the two sites. As presented in Table 3, property crime counts were higher in the treatment hot spots than in the control hot spots, overall and in each of the three periods in both sites, though these differences were not statistically significant. Violent crime counts also did not differ significantly across treatment and control locations, with the exception that treatment locations in site B had higher counts during the post-intervention period. However, this difference was not statistically significant after correcting for multiple testing (Šidák, 1967).

Table 3 Description of intervention activities: Crime counts by site, treatment assignment, and time period
Table 4 Description of intervention activities for three-level treatment: Crime counts by site, treatment assignment, and time period

We observe a downward trend in the average property crime counts for both the treated and control hot spots over time in site A, whereas the trend is not monotonic for site B. The violent crime counts are lowest during the pre-intervention period for both the treatment and control hot spots in site A. In site B, we noted a slight increasing trend for the treatment hot spots and a slightly decreasing trend for the control hot spots.

As presented in Table 4, the property crime counts were significantly higher in the high treatment hot spots than in the control hot spots in site B during the pre-intervention period. The violent crime counts were significantly lower in the low treatment hot spots than in the control hot spots in site A during all three periods.

Table 5 reports the log-linear regression results of property crime counts by treatment assignment in site A and site B. Model 1 includes a binary treatment effect, i.e., treatment vs control (control as the reference category), and model 2 includes a three-category treatment effect, i.e., low treatment and high treatment vs control (control as the reference category). We observed seasonal effects in both models, using winter 2018 as the reference category. For example, in the site A model 1, property crime in winter 2019 was around 13% lower than in winter 2018 (RR = 0.874). Similarly, the crime counts in spring, summer, and fall of 2020 were over 30% lower than those in winter 2018 (RR ranges from 0.566 to 0.685). Further, for each standard deviation increase in area square miles, there is a 17% increase in property crime counts (RR = 1.174). With each standard deviation increase in percent of business addresses, we observed a 33% increase in property crime counts (RR = 1.325). In both models, neither the treatment variable nor the interactions between treatment and time periods were statistically significant. We observed similar findings for site B.
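
The percentages quoted above follow directly from the rate ratios (RR): in a log-linear count model, RR is the exponential of the coefficient, and the percent change is (RR − 1) × 100. The short sketch below reproduces that arithmetic for the RR values reported in the text; the helper names are hypothetical.

```python
import math


def rate_ratio_from_coef(coef):
    """Convert a log-linear (Poisson) regression coefficient to a rate ratio."""
    return math.exp(coef)


def percent_change(rate_ratio):
    """Percent change in the expected count implied by a rate ratio."""
    return (rate_ratio - 1.0) * 100.0


# Rate ratios quoted in the text for site A, model 1 (property crime).
reported = {
    "winter 2019 vs winter 2018": 0.874,
    "area, per SD increase": 1.174,
    "business addresses, per SD increase": 1.325,
}

for label, rr in reported.items():
    print(f"{label}: RR = {rr:.3f}, change = {percent_change(rr):+.1f}%")
```

Rounded to whole numbers, these give the roughly 13% decrease, 17% increase, and 33% increase described in the paragraph.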

Table 5 Log-linear regression models of property crime counts by treatment assignment

Table 6 reports the results for violent crimes. We observed some significant effects of the treatment in site A. In site A model 1, we observe that the treatment hot spots have about 17% lower violent crime than the control hot spots (RR = 0.825), representing a baseline pre-intervention difference due to our model configuration. In site A model 2, we note that the low-treatment group has a 35% (RR = 0.654) lower crime rate than the control group, also driven by a baseline pre-intervention difference. Such effects cannot be interpreted alone when interactions between treatment and time periods are included in the model. Next, we examined the interaction between the binary/three-categorical treatment status and time period. For site B model 1, we observed that the treatment/control difference was 160% (RR = 2.588) higher in the post-intervention period compared to such difference in the pre-intervention period. This is driven by the difference between hot spots receiving a low treatment dosage vs those in the control condition (i.e., not a main effect). Model 2 for site B shows that the low treatment/control difference during the post-intervention period is 280% (RR = 3.8) higher than such difference in the pre-intervention period.
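
Because the models use a log link, an interaction rate ratio is a ratio of two rate ratios: the treatment/control RR in a given period divided by the treatment/control RR in the pre-intervention period. The sketch below illustrates this relationship with the site B model 1 figures quoted in the text (interaction RR = 2.588, and the post-intervention contrast RR = 2.575 reported for Table 8). It assumes both figures come from the same fitted model with no other terms involving treatment; the implied pre-intervention RR is derived here for illustration and is not a value reported by the study.

```python
import math


def period_rate_ratio(baseline_rr, interaction_rr):
    """Treatment/control RR in a period: coefficients add on the log scale."""
    return math.exp(math.log(baseline_rr) + math.log(interaction_rr))


def implied_baseline_rr(period_rr, interaction_rr):
    """Pre-intervention treatment/control RR implied by the other two."""
    return period_rr / interaction_rr


interaction_rr = 2.588  # site B, model 1: treatment x post-intervention (from the text)
post_rr = 2.575         # site B, model 1: post-intervention contrast (from the text)

# Derived under the stated assumptions; not reported in the paper.
pre_rr = implied_baseline_rr(post_rr, interaction_rr)
print(f"implied pre-intervention RR: {pre_rr:.3f}")
print(f"reconstructed post-intervention RR: {period_rate_ratio(pre_rr, interaction_rr):.3f}")
```

An implied pre-intervention RR close to 1 is consistent with the description of no significant treatment/control difference in violent crime in site B before the intervention.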

Table 6 Log-linear regression models of violent crime counts by treatment assignment

These findings of no main effects are further corroborated in Table 7 (property crime) and Table 8 (violent crime), where we present the treatment–control differences separately for the pre-, during-, and post-intervention periods, as well as the comparison of treatment–control differences across these three time periods (also see Figs. 1 and 2). As presented in Table 7, none of the differences between treatment and control hot spots is significant in any of the three time periods for property crime.

Table 7 Contrasts for binary and three-categorical treatment effect models for property crime counts
Table 8 Contrasts for binary and three-categorical treatment effect models for violent crime counts
Fig. 1 Crime counts in site A between January 2018 and December 2020
Fig. 2 Crime counts in site B between January 2018 and December 2020

Table 8 shows some significant differences in violent crime counts, but only between low treatment and control hot spots in both sites. As presented in Table 8, in site B model 1, despite the non-significant differences during the pre-intervention and intervention periods, the treatment hot spots had a significantly higher violent crime count, by about 160% (RR = 2.575), than the control hot spots in the post-intervention period, suggesting a possible backfire effect. However, the three-level treatment model (model 2: high treatment vs low treatment vs control) shows that the difference is only between low treatment hot spots and control hot spots. That is, the violent crime count in the low treatment hot spots was more than 200% (RR = 3.256) higher than that in the control hot spots during the post-intervention period, but there were no significant differences between these groups in either the pre- or during-intervention periods. This signals that the potential backfire effect is observed only in hot spots that received low-level treatment (which was not randomized), and not in the hot spots that received high-level treatment. A potential backfire effect was also observed in hot spots that received low-level treatment in site A. In site A model 2, the hot spots that received low-level treatment had about 35% (RR = 0.654) lower violent crime than control hot spots in the pre-intervention period. This difference became non-significant in both the during- and post-intervention periods, which suggests a possibly steeper increase in violent crime in the low-level treatment hot spots than in the control hot spots.

Discussion

This study involved two fairly large RCTs of POP. The intervention was implemented over a longer period (16–17 months in the two cities) than many POP interventions, which typically last six months or less (Braga et al., 2019). The experimental design, with strict adherence to the randomization protocol by the participating law enforcement agencies, provides confidence in the comparability of the treatment and control conditions in this study. Additionally, we had the opportunity to replicate our findings by implementing our RCT in two cities. This study, mainly site A, serves as one of the first evaluations of POP strategies after the large increases in police-community tensions that followed the killing of George Floyd by police in Minneapolis in May 2020 and the onset of the COVID-19 pandemic.

Past studies of POP examined intervention effects by comparing treatment vs control and have rarely considered the natural variation in implementation dosage and length (Braga et al., 2019), especially over a longer intervention period of more than a year. In this study, we filled this gap by explicitly taking into consideration that the dosage varied across locations throughout the intervention period. In addition to comparing the treatment and control hot spots, we created a three-category treatment variable (low treatment, high treatment, and control) to compare treatment hot spots that received a high level of treatment with those that received a low level of treatment. We observed that crime counts in the low treatment hot spots were lower than in both the high treatment and control hot spots in site A during the pre-intervention period. However, given that we did not randomize the levels of treatment, it is not completely unexpected that the intensity of the treatment varied unevenly. This difference was taken into consideration in the multivariable analyses.

Our multivariable analyses assess the effects of the intervention on violent and property crime in the two sites. We assessed changes in property and violent crime counts for treatment and control hot spots overall and separately for the pre-, during-, and post-intervention periods in the two sites. We found no intervention effect on property crimes in either study site, but some backfire effects of low-level treatment on violent crimes. In site B, despite the non-significant differences during the pre-intervention and intervention periods, the treatment hot spots had a significantly higher violent crime count than the control hot spots in the post-intervention period. However, this difference was significant only between low treatment hot spots and control hot spots. That is, the violent crime count in the low treatment hot spots was more than 200% higher than that in the control hot spots during the post-intervention period. Therefore, we conclude that the observed backfire effect on violent crime in site B occurred only in hot spots that received low-level treatment, and not in the hot spots that received high-level treatment. A similar backfire effect was also observed in hot spots that received low-level treatment in site A. In site A, the hot spots that received low-level treatment had about 35% lower violent crime than control hot spots in the pre-intervention period, but their violent crime increased to a level similar to that of the control hot spots in the post-intervention period.

It is important to point out that we did not detect a clear backfire effect during the intervention period in either site. The low/high treatment designation was not based on random assignment; rather, officers paid less attention to treatment locations with lower levels of crime. Hence, we would not necessarily expect the low treatment locations to have crime levels and trends equivalent to those of the control group. The results may therefore simply be driven by natural deviations between the low treatment group and the control group. In addition, we cannot rule out a reporting effect. It could be that increased police activity in the low treatment locations was enough to prompt more crime reporting and detection but not yet enough to have a meaningful impact on crime.

The intervention faced several challenges, which contributed to the finding of no observed reduction in property crime and a small but significant backfire effect on violent crime in hot spots that received low-level treatment. First, both agencies experienced shortages of officers, initially because both went through administration changes at the beginning of the study and later because of the COVID-19 pandemic, making it challenging to implement the full planned dosage of POP. Second, officers in site A experienced great challenges in involving community members due to the COVID lockdowns starting in the spring of 2020 (site B had implemented most of its intervention before COVID-19). Third, even though we conducted a half-day training for officers prior to the intervention, the challenges mentioned above and limitations of funding and time meant that the POP efforts did not strictly follow the SARA (scanning, analysis, response, and assessment) model. Rather, such efforts often fell short and relied heavily on patrol, enforcement, and other simple tactics (e.g., situational crime prevention measures), what some refer to as “shallow” problem-solving (Braga & Bond, 2008; Cordner & Biebel, 2005; Eck, 2006). In other words, despite the original plan, the community engagement and problem-solving efforts were not as systematic, formalized, and extensive as planned, reinforcing concerns about the difficulty of implementing strong POP in a patrol context, particularly over a long-term period.

Our results suggest that POP strategies, moderately infused with community policing, were not effective in reducing crime in this unusual context, and there is some suggestion that they might even lead to increases in crime when implemented at very low levels. Perhaps low levels of POP begin to engage the community in an intervention but send the wrong message to criminals when that engagement is at a low level. That is, these levels of implementation might embolden those interested in committing crime by signaling that the policing activities will be too weak to have any effect on their criminal activities. While our RCTs were large enough to find effects and the randomization produced comparable groups for analysis, given our concern about the modest levels of POP implementation, we caution against interpreting our results as making a case for stopping implementation of POP across law enforcement.

We encourage replication studies to assess whether the backfire effect in low-level treatment hot spots is a true backfire effect. Based on the finding of a potential detrimental effect of low-level treatment, we recommend a higher and more consistent dosage of POP efforts that engage the community. That is, even in long-term POP programs like the one we tested, there is still a need to maintain sufficient and consistent dosage levels to produce the desired reduction in crime.

While some of the circumstances limiting these efforts in our study sites may have been unique (e.g., the COVID-19 pandemic and negative community sentiment in the post-Floyd period during the height of site A’s implementation of POP), other researchers have also documented the difficulty of implementing rigorous problem-solving in the context of patrol work before the pandemic (Cordner & Biebel, 2005; Groff et al., 2015). Nonetheless, experience elsewhere has shown that with strong organizational commitment and formalized processes to regularly track, manage, and support patrol activities, consistent implementation of hot spot policing strategies can be achieved even over long periods and across large numbers of hot spots (Koper et al., 2021).

These study findings need to be considered within the context of our recognized study limitations. As noted above, we have good evidence that our random assignment process worked as planned and created comparable treatment and control conditions. However, one downside of our experimental assignment process was the creation of somewhat artificial conditions under which we asked the law enforcement agencies to operate. While the officers in our study carefully followed the assignment pattern dictated by the experiment, confining officers to work in small hot spots is not likely how they would normally operate. They would likely prefer to move more naturally through their service areas and focus on hot spots at their discretion, and they might achieve different results using a strategy they prefer. As noted earlier, the agencies also struggled to provide POP at high levels. They struggled not only with doing rigorous POP and community engagement but sometimes also with just getting to the hot spots to provide a meaningful presence. Even in the high-dosage locations, their presence may not have been regular enough to register an effect. Our results may reflect more on the difficulty of doing consistent POP work than they do on the effectiveness of POP per se.

Next, this paper was limited to official police measures (Uniform Crime Reporting, UCR) and is subject to all of the problems associated with UCR measures (Klinger & Bridges, 1997; Sherman et al., 1989). Given that UCR data measure the occurrence of crime with error, they can lead to some inconsistent results. Also, our study focused on Part I property and violent crimes, and we did not measure lower-level forms of crime on which the intervention might have had an effect. Another consideration is that the two cities in our study had relatively modest crime problems (perhaps greater effects could be achieved in places with higher baseline crime levels), and both agencies had ongoing enforcement and community engagement activities that were targeted to varying degrees on high-crime areas during our study. Although these efforts were not as precisely targeted as the CPOP program, they arguably had the potential to confound measurement of CPOP’s effects.

In conclusion, the results of this RCT provide an important test of POP under very different circumstances than prior work (i.e., for site A, during a multi-year global pandemic and a time of large increases in police-community tensions following the murder of George Floyd by police in May 2020). Our results suggest that POP infused with community policing was not effective in reducing crime, and there is some suggestion that it might even lead to increases in crime when implemented at very low levels. Given the challenges of implementing the intervention during this unique time in history, we caution against interpreting the findings as evidence that invalidates decades of work showing hot spot policing and POP to be effective in reducing crime. We thus encourage other researchers to continue to evaluate place-based POP with community-oriented policing and to emphasize heavier and more consistent implementation, as POP might yet be shown to be a viable tool to decrease crime and improve police-community relations even during unusually difficult times for communities and police departments.