Introduction

What would it take to reduce unnecessary police use of force? At present, many police departments globally are attempting to do that by equipping police officers with body-worn cameras (BWCs) in order to potentially de-escalate volatile encounters through the deterrent threat of apprehension for noncompliant behavior. Yet the billions of taxpayers’ dollars spent (Friedman 2015), and the headline news coverage, have not been matched by a comparable growth in research evidence on this new technology (Lum et al. 2015; White 2014).

We have previously reported results from a global multisite randomized controlled trial on the effect of BWCs on various outcomes, including use of force, complaints against the police, and assaults against officers (Ariel et al. 2016). Averaged over ten trials, we reported that the use of police BWCs had no overall effect on use of force. However, our results varied, with force increasing in some trials and reducing in others. These conflicting results were puzzling and disturbing. Why would officers, knowing that their actions were being filmed by their own equipment, choose to apply force more often when cameras were on in some instances? Similarly, why would suspects’ demeanor become more aggressive or noncompliant under these circumstances? This runs contrary to both common sense and a good deal of research across disciplines on the effect of deterrence on compliance behavior as well as the law (see Nagin 2013; in the framework of BWCs, see Ariel 2016). Here we report on planned subgroup analyses aimed at disentangling both of these effects.

Background

Police are, to mix two phrases spanning the Atlantic, the “thin blue line” that “protects and serves”. However, as the legitimate use of force rests with police officers, citizens should be, and in fact are, concerned that force is used proportionately and fairly. Time and again we learn that some police officers use excessive force in a manner wholly unnecessary for a situation (The Guardian 2015), or are unable to de-escalate tense engagements with members of the public (see Piquero et al. 2006; Sherman 1980). Similarly, we know that some citizens’ demeanor provokes use of force, often through verbal or physical assault of officers (Reisig et al. 2004; Terrill and Mastrofski 2002; but cf. Engel et al. 2000); these instances are likely to result in aggressive arrests and use of more police force. Thus, reducing police use of force is a laudable aim for law enforcement agencies as this can improve public perceptions of police, whilst at the same time making policing more effective (Mazerolle et al. 2013; The White House 2014).

“Force” can be used to achieve a lawful objective, such as making a lawful arrest, subduing a resisting individual, acting in self-defense, or protecting others (College of Policing 2015). However, measuring what “police force” is, at which point it becomes excessive, unnecessary, or disproportionate (Harris 2010)—or even who instigates the use of force beyond what is required—is far from clear (Ariel et al. 2014). There is no tracking system of police force that is completely reliable or even valid—since the amount of “force” necessary in a given situation is subjective, primarily self-reported, and heavily underreported (Hickman et al. 2008). Yet at its core, police use of some force is an essential requirement against certain offenders, under specific circumstances. The issue is not whether or not force needs to be applied, but how it can be minimized.

It is at this juncture that BWCs enter: the basic motivation for police BWCs is, among many other things, to reduce use of force (Miller et al. 2014). The theoretical basis for the use of cameras—that being monitored changes behavior—is deterrence theory (see review in Ariel et al. 2014:4–6). Police BWCs—at least as implemented in our trials—fulfill the causal mechanisms of deterrence to regulate police–citizen encounters. BWCs increase the perceived certainty of apprehension for rule violations. In this sense, the underlying assumption is that officers will use excessive or unnecessary force less frequently than during control conditions. From the citizen/suspect side of things, assaulting the officer, resisting arrest or committing detectable offenses is also bound to lead to further sanctions, which (rational) actors will tend to avoid (see Cornish and Clarke 2014)—a point we return to in the conclusion. When BWCs are actually turned on and appropriately activated, they can efficiently detect rule violations and law breaking by officers and/or citizens—and this process can send a credible deterrence threat (see Jervis et al. 1989:3). In this sense, BWCs are unlike CCTVs (Welsh and Farrington 2009), dashboard cameras, or bystanders’ mobile phone cameras. For a more elaborate discussion on this theoretical mechanism behind BWCs, see Ariel (2016:105–109).

A recent review of the available evidence conducted by Lum et al. (2015) has shown that there are, currently, 12 existing empirical studies of BWCs and about 30 ongoing research projects (see also review by White 2014). While there were attempts to implement BWCs in policing nearly a decade ago (Goodall 2007; Harris 2010), evidence on their effectiveness has only surfaced in the last couple of years (see Lum 2015). Four of the studies employed randomized controlled trials (Ariel et al. 2014; Grossmith et al. 2015; Jennings et al. 2015; Owens et al. 2014), and others have used less robust designs (e.g., Ariel 2016). The most recent published studies employed different units of analysis to measure various consequences of the use of BWCs in police routine operations: in Orlando, Florida, the unit of analysis was the individual officer (Jennings et al. 2015); Ready and Young (2015) compared volunteers versus non-volunteers, while others have used a version of cluster random assignment (Grossmith et al. 2015). Lesser designs, without random allocation of units into treatment and control conditions (e.g., Ariel 2016; Ellis et al. 2014), have produced mixed results about the effectiveness of BWCs, ranging from supportive evidence, to null, and even backfiring effects.

The Rialto experiment (Ariel et al. 2014) was the first randomized controlled trial that looked at the effectiveness of BWCs, and specifically focused on use of force and complaints. Rialto Police Department, a jurisdiction in California with just over 50 frontline officers, compared nearly 1,000 police shifts during which all police–public encounters were equally assigned to either treatment or control conditions. During treatment shifts, Rialto officers were asked to record (audio/video) all their encounters with members of the public and to store evidence on a secured cloud. In the control shifts, the officers were instructed to never carry the devices. Outcomes were then measured in terms of officially recorded use of force incidents and complaints lodged against Rialto police officers. Following this 12-month experiment, Ariel et al. (2014) reported a relative reduction of roughly 50 % in the total number of incidents of use of force compared to control conditions, and about a 90 % reduction in citizens’ complaints, compared to the 12 months prior to the experiment.

Ready and Young (2015) conducted an experiment with the Mesa, Arizona Police Department. The study analyzed nearly 3,700 field reports completed by 100 sworn patrol officers. Random assignment of the officers into treatment and control groups yielded several important findings. First, officers who did not wear body-worn cameras were more likely to conduct stop and search, and were also more likely to make an arrest. This means that wearing BWCs may cause officers to be more cautious and risk-averse than officers in control conditions. At the same time, treatment officers were more likely to give citations and initiate encounters. This suggests that BWCs may make officers more proactive, but without increasing their use of invasive strategies that “may threaten the legitimacy of the organization” (Ready and Young 2015:445).

Finally, Jennings et al. (2015) have also observed the effect of BWCs on policing, but focused particularly on response-to-resistance incidents. In their controlled experiment, they randomly assigned 46 (of 89) officers to wear BWCs, with the remaining 43 officers assigned to a no-BWC condition. The study has shown that BWCs reduced these types of incidents and serious external complaints. The prevalence of response-to-resistance incidents and the prevalence and frequency of “serious” external complaints were significantly lower for officers randomly assigned to wear BWCs (p. 480).

These studies suggest that the credible deterrent effect of BWCs may rest on four critical points: that the camera is (i) actually worn by the officer, (ii) turned on, and (iii) used during the police–public encounter. It also requires that (iv) suspects are fully cognizant of the BWC, for instance through officers announcing to the suspect that the interaction is being officially recorded. If the officer applies discretion and does not record an encounter—for whatever reason—then the deterrent effect of the camera will be nil. To use a medical analogy, one must take a pill in order for it to be effective.

It is precisely the issue of discretion where we believe that the effect of BWCs can vary, and the continued debate around police officers’ discretion is at the core of our present analysis. “Police work is complex [,] police use enormous discretion [, and] discretion is at the core of police functioning” (Kelling 1999:6). Discretion underpins policing by consent (Leyland 2012). Yet “discretion” per se is a vague term that broadly reflects the decision-making power afforded to officers with the aim of allowing them to decide whether or not, and to what extent, to follow police procedures (Hirby 2015). Codified rules and regulations often limit the use of discretionary powers—for instance in the area of mandatory arrests for domestic violence (see review in Hirschel et al. 2007), shooting at moving vehicles (e.g., Alexander City Police Department 2008), holding suspects in a chokehold position (e.g., NYPD Patrol Guide 2013), or the use of ‘enhanced interrogations’ on suspects under extreme circumstances. However, in the area of use of “force”, the guidelines are suggestive: the force continuum, which is meant to align the appropriate amount of force to various scenarios in order to de-escalate the situation, can be overridden based on the officers’ subjective perception at the time of the incident (see Watson et al. 2008). It is only with the clear vision of hindsight that a “forceful incident” might be deemed to have resulted from an abuse of decision-making powers. The core of the analysis presented below is to understand what role police discretion plays in the emergent area of police BWCs.

Methods

Experimental procedure

The experimental design underpinning this paper has been reported elsewhere (Ariel et al. 2016) so is only summarized here. Jointly, the trials involved 2,122 patrol officers in eight police departments, with 2,188,712 officer-hours. The unit of analysis across ten trials is the police shift, in keeping with the Rialto Experiment (Ariel et al. 2014). Each week, shifts were randomly assigned to ‘cameras on’ or ‘cameras off’. This is the most practical approach to implementing BWC trials with police, because it means that even small forces, such as Rialto with only 54 officers, are able to leverage sample sizes as high as nearly 1,000 shifts per year (Ariel et al. 2014). This resulted in 4,915 shifts being assigned (M = 491.50; SD = 276.99 per site), with no differences between treatment and control conditions in terms of the distribution of shifts (Ariel et al. 2016).
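The weekly allocation of shifts described above can be illustrated with a short sketch. This is not the trials' actual randomization procedure, which is not detailed here; the function name `assign_week`, the shift labels, the seed, and the choice of a balanced (near 50/50) split are all assumptions made purely for illustration.

```python
import random
from typing import Dict, Iterable, Optional


def assign_week(shift_ids: Iterable[str], seed: Optional[int] = None) -> Dict[str, str]:
    """Randomly allocate one week's shifts to 'cameras on' or 'cameras off'.

    Assumption: a balanced split, with any odd shift going to 'cameras off'.
    """
    rng = random.Random(seed)      # seeded generator so the allocation can be audited
    shuffled = list(shift_ids)
    rng.shuffle(shuffled)          # random order decides the arm
    half = len(shuffled) // 2
    assignment = {shift: "cameras on" for shift in shuffled[:half]}
    assignment.update({shift: "cameras off" for shift in shuffled[half:]})
    return assignment


if __name__ == "__main__":
    # Hypothetical week of 21 shifts (3 per day); labels are illustrative only.
    week = [f"shift-{i:02d}" for i in range(21)]
    for shift, arm in sorted(assign_week(week, seed=7).items()):
        print(shift, arm)
```

Because the shift, rather than the officer, is the unit being allocated, the same officer can appear in both arms over the course of a trial, which is how a small force can still generate a large sample.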

In practice, our pre-published protocol stripped officers of their discretion to decide when, where, and under which conditions BWCs would be applied. Following from the same theoretical mechanism behind surveillance underpinning the Rialto experiment (Ariel et al. 2014; see also Tankebe and Ariel 2016), we felt that officers should not be able to make ad-hoc decisions as to whether or not BWCs should be used on a case-by-case basis. To try and minimize officer discretion, each participating force agreed to the following treatment conditions, with the emphases in the original experimental protocols (verbatim):

  8. TREATMENT AND COMPARISON ELEMENTS

  8.1. Experimental or Primary Treatment

  8.1.1. What elements must happen, with dosage level (if measured) indicated.

  8.1.1.1. Wearable, personal cameras attached to each patrolling officer during experimental shifts, with capability of capturing and recording police interaction with the public (offenders, witnesses, victims), in both colour video and audio.

  8.1.1.2. Cameras must be turned on during every interaction with the public as soon as officer(s) get out of the police vehicle, until

  8.1.1.2.1. Situation stabilises and/or;

  8.1.1.2.2. Offender is brought into custody.

  8.1.1.3. Members of the encounter must be notified through a script (i.e., “you are being recorded on tape”)

  8.1.1.4. Cameras must be worn on the uniform and visible, during experimental shifts.

  8.1.2. What elements must not happen, with dosage level (if measured) indicated.

  Element A: cross-over

The overall result from our previous analyses (Ariel et al. 2016) was that there were no significant differences between treatment and control arms [standardized-mean-difference (SMD) = .021; SE = .056; 95 % CI −.089 to .130]. Taken at face value, the nonsignificant overall results provide weak support for the deployment of BWCs in policing, as nonsignificant results are difficult to interpret (on methodological challenges associated with nonsignificant experimental results, see discussion in Ariel 2012: 55–58). However, as Ariel and Farrington (2010: 449–450) concluded, subgroup analyses are a natural step that comes after testing for main effects, particularly when the findings are heterogeneous. This was the case in this multisite study, where significant variability existed in the sites included in our prospective meta-analysis (Q = 17.902; p < .05; I² = 49.7 %). Therefore, we sought to conduct further analyses with the data, as they may provide valuable information about the benefits and hazards of the intervention in subsets of participants. To be clear, we present results from pre-planned subgroup analyses on the efficacy of the treatment for particular groups of interest, in a pre-specified manner. We did not perform these analyses when presenting the preliminary main effects in Ariel et al. (2016), as the data we were interested in (police officers’ discretion) were not available at the time.
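The heterogeneity statistics quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes the conventional definition of I² as (Q − df)/Q with df = k − 1 for k = 10 trials, and compares Q against the tabulated chi-square critical value for 9 degrees of freedom at the .05 level; neither the formula nor the function names come from the original analysis, so treat them as assumptions.

```python
# Tabulated chi-square critical value for df = 9 at alpha = .05.
CHI2_CRITICAL_DF9_ALPHA05 = 16.919


def i_squared(q: float, k: int) -> float:
    """Percentage of total variability attributed to between-trial heterogeneity.

    Assumes I^2 = max(0, (Q - df) / Q) * 100 with df = k - 1.
    """
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def q_is_significant(q: float, critical_value: float) -> bool:
    """True when Cochran's Q exceeds the supplied chi-square critical value."""
    return q > critical_value


if __name__ == "__main__":
    q_reported, n_trials = 17.902, 10
    print(f"I^2 = {i_squared(q_reported, n_trials):.1f} %")
    print("Q significant at .05:", q_is_significant(q_reported, CHI2_CRITICAL_DF9_ALPHA05))
```

Under these assumptions the reported Q reproduces the reported I² to one decimal place, and Q exceeds the critical value, which is consistent with p < .05.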

Defining use of force

Police use of force was measured by official self-reported logs by police officers in the line of duty. For the purpose of this study, “use of force” was defined as any application of physical restraint on the force continuum beyond handcuffing, in order to gain control of a suspect or situation. Police regulations often dictate that whenever officers use force during a shift they are obliged to report this, although practice and form vary from place to place. As one participating force dictated, “The decision to record use [of force] is a matter for the commander and must be entered in the commander’s policy log and the need for recording stated in the operational order or briefing.” Force incidents were calculated as a rate per the number of arrests, as a way of standardizing this measure and accounting for the different sizes of the police forces involved in the trials. We are cognizant that not all incidents of use of force are recorded by officers, for various reasons discussed elsewhere (Sommers 2013); however, we assumed that the error rate in recording violations would be randomly distributed across treatment and control conditions, given our experimental design.

Measuring compliance/integrity

We defined “treatment integrity failure” in two ways. The first was when the police department explicitly stated that its officers were granted discretion to wear and use BWCs, and to record incidents, despite the experimental protocol. This noncompliance could occur either during control shifts, when officers used the devices when they were not supposed to, or during treatment shifts, when officers were able to decide when to turn devices on during individual encounters (the protocol stated that “every” encounter was to be recorded as soon as the officer arrived at the scene).

Our second measure of treatment integrity failure was more detailed. Meta-data automatically collected by camera manufacturers created an objective measure: Any footage recorded during control shifts was considered a violation of the protocol—that is, a sign of noncompliance. A control shift with at least 1 min of recorded footage was considered a breach of protocol, as the devices should not have been worn at all during control shifts. Conversely, when officer cameras did not record any amount of data during a shift, this was an indication that the camera had either been left back at the station, or switched off during the shift; we nominally defined a treatment shift without any downloaded footage (0 min or 0 megabytes downloaded) as a protocol violation, as it is unreasonable that any officer on duty would have nil interaction with members of the public in middle or large police departments. There were some pre-agreed types of cases where officers were allowed to switch off cameras—such as serious sexual violence cases or dealing with informants—but otherwise officers were required to keep cameras on throughout their shifts and record every interaction with members of the public.
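The footage-based rule described above can be expressed as a small classifier. The sketch below is illustrative only: the `Shift` record, its field names, and the helper `violation_share` are hypothetical and do not reflect the manufacturers' actual metadata format, and the pre-agreed exemptions (such as serious sexual violence cases) are not modelled.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Shift:
    shift_id: str            # hypothetical identifier
    arm: str                 # "treatment" or "control"
    recorded_minutes: float  # total footage recorded during the shift


def is_protocol_violation(shift: Shift) -> bool:
    """Apply the two footage rules stated in the text."""
    if shift.arm == "control":
        # Cameras should not be worn at all: 1 min or more of footage is a breach.
        return shift.recorded_minutes >= 1
    if shift.arm == "treatment":
        # Cameras should record every encounter: no footage at all is a breach.
        return shift.recorded_minutes == 0
    raise ValueError(f"unknown arm: {shift.arm!r}")


def violation_share(shifts: Iterable[Shift]) -> float:
    """Fraction of shifts flagged as protocol violations (0.0 for an empty input)."""
    shifts = list(shifts)
    if not shifts:
        return 0.0
    return sum(is_protocol_violation(s) for s in shifts) / len(shifts)


if __name__ == "__main__":
    sample = [
        Shift("c-01", "control", 0.0),
        Shift("c-02", "control", 12.5),
        Shift("t-01", "treatment", 0.0),
        Shift("t-02", "treatment", 48.0),
    ]
    for s in sample:
        print(s.shift_id, is_protocol_violation(s))
    print(violation_share(sample))
```

Note that the rule is deliberately asymmetric: any meaningful footage counts against a control shift, whereas only a complete absence of footage counts against a treatment shift, so partial recording within treatment shifts is not captured by this measure.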

Table 1 reports on compliance of each site. As shown, three “compliance subgroups” emerged: sites in which compliance was high (n = 3), sites in which compliance broke down completely (n = 4), and sites in which the police maintained compliance during control conditions, but failed to follow experimental protocol during treatment shifts only (n = 3). Put differently, four sites gave officers complete discretion on when and where BWCs should be used. Three sites stripped officers of their discretion completely. Three more sites adhered to the protocol for control conditions, but allowed officers discretion as to when BWCs should be used in treatment cases, including at what point during the interaction BWCs were turned on in these treatment cases.

Table 1 Descriptive statistics in ten experimental sites and breakdown of sites based on treatment integrity failure

Results

As noted, there was an overall nonsignificant main effect (thus far) of cameras on police use of force based on analyses synthesizing results from the ten trials with data. However, this average masks instances where use of force decreased and others where it increased. Figure 1 shows the results subgrouped by the level of treatment integrity; recall that “compliance” with the protocol means that officers did not have discretion about when cameras were turned on.

When officers followed the experimental protocol (“high compliance”), use of force decreased in line with expectations (SMD = −.346; SE = .137; 95 % CI −.614 to  −.077). However, in trials where officers were able to use full discretion in both treatment and control conditions (“no compliance”), the overall effect was nil (SMD = .009; SE = .070; 95 % CI −.127 to .146). This outcome can be expected, as there were no differences in the application of the treatment in either experimental or control conditions. Thirdly, when officers applied discretion during treatment conditions only but followed protocol during control conditions, use of force increased (SMD = .392; SE = .130; 95 % CI .136 to .647). These patterns are presented in Fig. 1 below. In terms of percentage change, high-compliance trials witnessed rates of use of force decreasing by 37 %. In trials with no compliance in treatment conditions, rates of use of force increased by 71 %.
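As a reading aid, the subgroup intervals reported above can be approximately reproduced from each SMD and its standard error. The sketch below assumes a normal-approximation 95 % interval (SMD ± 1.96 × SE), which is an assumption rather than the documented method; the dictionary keys are shorthand labels, not terms from the analysis.

```python
from typing import Dict, Tuple

# (SMD, SE) pairs as reported in the text for each compliance subgroup.
SUBGROUPS: Dict[str, Tuple[float, float]] = {
    "high compliance": (-0.346, 0.137),
    "no compliance": (0.009, 0.070),
    "discretion in treatment only": (0.392, 0.130),
}


def ci95(smd: float, se: float, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation 95 % confidence interval (assumed method)."""
    return smd - z * se, smd + z * se


def excludes_zero(smd: float, se: float) -> bool:
    """True when the whole interval lies on one side of zero."""
    lower, upper = ci95(smd, se)
    return lower > 0 or upper < 0


if __name__ == "__main__":
    for label, (smd, se) in SUBGROUPS.items():
        lower, upper = ci95(smd, se)
        print(f"{label}: {lower:.3f} to {upper:.3f}, excludes zero: {excludes_zero(smd, se)}")
```

Under this assumption the computed bounds agree with the reported intervals to within about .001, and only the "no compliance" interval spans zero, matching the pattern of a decrease, a null effect, and an increase.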

Fig. 1 Use of force rates: treatment integrity breakdown

Discussion and conclusions

The aim of the trials described above was to assess whether the audio and visual digital recording of police–citizen interactions could act to deter police from using force and/or deter suspects from instigating forceful encounters. The results demonstrate that BWCs are able to achieve this objective, but only in situations where police relinquish some discretion on activating these devices. In fact, when police used discretion during treatment shifts, reported use of force increased. The causal mechanism for this increase is not clear, but we speculate that the selective activation of cameras by police is associated with situations that are already escalating in aggression. Furthermore, we also suggest that activating a camera during a tense situation may serve to increase the aggression of the citizen/suspect (and thus the officer).

Given our preliminary findings, we think that there is a clear route for addressing these concerns by improving the implementation of BWCs around the world: cameras should remain on throughout the entire shift (that is, during each and every interaction with citizens), and each interaction should be prefaced by a verbal reminder that the camera is present. We argue that the verbal reminder delivered by the officer wearing the camera provides a mechanism to remind both parties that ‘rules of conduct’ are in play: common courtesy from officer and citizen for one, and potentially a legal requirement given the weight of privacy sensitivities in the public domain. Pushing this further, we argue that the verbal prompt is a mechanism that pushes mental processing of the situation towards the rational–deliberative mode of thought (Kroneberg et al. 2010; Thaler and Sunstein 2008), thus enabling the hypothesized deterrent effect of the camera to actually operate on officer and/or citizen. Future research on body-worn cameras ought to look into these theoretical as well as practical implications more robustly than has been done to date.