1 Introduction

In response to increasing rates of residential burglary in Germany over the past few years, police and policy-makers are attempting to determine new ways to stop or, ideally, reverse this trend. In this context, different strategies were applied and tested or are currently being developed in some federal states of Germany which make use of predictive policing (see also Egbert 2017, 2018; Gluba 2016; Belina 2016; Sommerer 2017). At least since the TIME Magazine ranked the application of predictive policing in Santa Cruz (US-CA) as one of the most important inventions in 2011, the term has attracted increasing attention during the last years in media as well as in academia. Predictive policing commonly refers to “a multidisciplinary, law enforcement-based strategy that brings together advanced technologies, criminological theory, predictive analysis, and tactical operations that ultimately lead to results and outcomes—crime reduction, management efficiency, and safer communities” (Uchida 2014). The interplay between those different aspects has also been described as “prediction-led policing business process” (Perry et al. 2013) with the authors emphasizing that in order to have an impact on crime, accurate predictions require adequate action. This results in a comprehensive feedback process consisting of four main steps (see Fig. 1), namely data collection (1), analysis (2), police operations (3) and criminal response (4). With recent advances in computer technology and the availability of huge data sets, the possibilities of structuring data, detecting patterns, visualizing data and making—more or less accurate—predictions of future crime have made considerable progress and what nowadays is listed as predictive policing has become a vast field with fuzzy boundaries. One approach to structure this field is to define the subject of predictions. 
Therefore, predictive policing is commonly divided into two subcategories: “location-based or geospatial predictive policing” and “individual-based predictive policing” (Sommerer 2017). Ferguson (2017) uses the subdivisions “place-based predictive policing” and “person-based predictive targeting”. The latter may be further divided into methods of predicting offenders, predicting perpetrators’ identities or predicting victims of crime (Perry et al. 2013). The most prominent example of “person-based predictive targeting” is the strategic subject list (SSL) in Chicago, where a risk score was assigned to previously arrested people to predict a heightened risk of becoming involved in future gun violence, either as a perpetrator or as a victim. With regard to the above-mentioned “prediction-led policing business process,” the SSL suffered not only from low predictive accuracy but also from a lack of guidance on how to translate predictions into police operations (Saunders et al. 2016). This finding stresses that the whole “prediction-led policing business process” has to be taken into account when examining and evaluating predictive policing, whether it is “place-based predictive policing” or “person-based predictive targeting”, because every strategy can produce its own unique difficulties during the various stages of the process.

Fig. 1

The “prediction-led policing business process” (Perry et al. 2013)

While many police departments in different countries have been testing and using varying predictive policing strategies, and statements from the police and media reports often sound promising, only a few attempts (for example, Hunt et al. 2014; Saunders et al. 2016; Mohler et al. 2015) have been made to evaluate applied predictive policing strategies from an independent scientific perspective [Footnote 1]. This has been criticized in the US context (Bennett Moses and Chan 2016; Perry et al. 2013; Ferguson 2017; Uchida 2014) as well as by German researchers (Gluba 2016; Belina 2016; Egbert 2017). This article contributes to closing this gap by reporting findings of an evaluation study of a pilot run of applied predictive policing in the federal state of Baden-Württemberg in Germany (Gerstner 2017) [Footnote 2]. The pilot run was designed by the State Office of Criminal Investigations (Landeskriminalamt Baden-Württemberg), and the commercial predictive policing software PRECOBS [Footnote 3] was used in the context of residential burglary. The pilot area included the urban districts (Stadtkreise) Stuttgart, Karlsruhe and Pforzheim and the more or less rural districts (Landkreise) Karlsruhe (LK), Calw and Enzkreis [Footnote 4]. The police department (hereafter PD) of Stuttgart is equivalent to the urban district; the other areas belong to the PD of Karlsruhe. The PDs are not spatially adjacent (see Fig. 2), and each has its own management. In order to achieve independent results, the pilot included an external scientific evaluation conducted by the Max Planck Institute for Foreign and International Criminal Law (MPICC). The evaluation period covered 6 months (31 October 2015–30 April 2016). The MPICC was not involved in planning and organizing the implementation of the pilot run, and there was, for example, no opportunity to include an experimental research design in the study.
Despite this limitation, it was possible to gather a large amount of data from which relevant conclusions can be drawn. For the purpose of the evaluation, automatically generated data (e.g. crime incidents, GPS data from police cars) were analysed to obtain assessments of practicality and information concerning crime preventive aspects. In addition, semi-structured interviews with the police officers operating the software were carried out to obtain an estimate of how the software performs in daily use. An online survey with more than 700 participants was conducted to gather information about the experiences and assessments of patrol officers bringing predictive policing to the streets, as well as of officers at the administrative level. The aspects illustrated in this article follow the above-mentioned “prediction-led policing business process” to give an overview of how predictive policing can be applied, which challenges must be faced besides making predictions, and to what extent success was measurable.

Fig. 2

Geographic data source: Federal Agency for Cartography and Geodesy of Germany, own graphic representation

Pilot area, police departments Stuttgart (right) and Karlsruhe (left). Stadtkreis (SK) = urban district, Landkreis (LK) = rural district.

2 Applied Predictive Policing and Application in the P4 Pilot

In Germany, applied predictive policing focuses mainly on place-based predictive policing in the context of residential burglary [Footnote 5] (for overviews, see Sommerer 2017; Egbert 2018), where usually a heightened probability of crime occurring in certain areas at a certain time is predicted. This approach is based on a large body of criminological theory and empirical studies. Recent efforts in spatiotemporal crime forecasting make use of concepts such as near-repeat victimization (Townsley et al. 2003; Mohler et al. 2011; Johnson and Bowers 2014; Schweer 2015) or risk terrain modelling (Caplan et al. 2011) but also rely on well-known theories like the routine activity approach (Cohen and Felson 1979), crime pattern theory (Brantingham and Brantingham 1984) or rational choice perspectives (Cornish 1986). This has resulted in many applications of place-based predictive policing strategies, mainly located in the USA, but also in Europe and elsewhere. Some of the applications are in-house developments by the police (e.g. SKALA in Germany, Landeskriminalamt 2018a); others are commercially distributed by private companies (e.g. PredPol [Footnote 6] or PRECOBS). The statistical methods used to predict crime range from more basic methods like logistic regression (e.g. Hunt et al. 2014) to approaches using machine learning or neural networks which fall within the scope of artificial intelligence (e.g. HunchLab, see Shapiro 2017).

As mentioned above, predictive policing has to be understood as a process. However, making predictions about future crime incidents remains the key element. The scientific community agrees that predictions have to be considered as non-binary probabilities rather than certainties (e.g. Perry et al. 2013; Degeling and Berendt 2018). In accordance with this, PRECOBS does not predict single events but rather assesses the likelihood that certain areas will experience burglaries during a certain time span. For a better understanding of the evaluation study’s findings, it is necessary to give a short insight into what kind of data is used to predict crime and how predictions are made. The following section summarizes the main points; more information can be found in the detailed evaluation report (Gerstner 2017) or in Schweer (2015).

3 Data, Analyses and Near-Repeat Prediction with PRECOBS

The ability of PRECOBS to forecast probabilities of future burglaries is based on the observation that crime events are often followed by further events in spatial and temporal proximity (illustrated in Fig. 3, subgraph a). This phenomenon has been observed for different types of crime and is called near-repeat victimization. The first empirical studies focused on residential burglary (Townsley et al. 2003; Bowers and Johnson 2004; Bowers and Johnson 2005), but the near-repeat hypothesis has also been tested for bike theft (Johnson et al. 2008), theft from/of motor vehicle (Lockwood 2012; Johnson et al. 2009) and other types of crime (for an overview, see Johnson and Bowers 2014: 3244). Nonetheless, in recent research, residential burglary remains the most studied type of crime in the context of near-repeat victimization (Bernasco et al. 2015; Sagovsky and Johnson 2007; Nobles et al. 2016; Chainey and da Silva 2016; Ornstein and Hammond 2017; Piza and Carter 2017). The rationale behind the repeated burglaries lies in the assumption that burglars act rationally and behave like optimal foragers (Johnson et al. 2009). This behaviour aims at increasing resources while decreasing risk and effort, leading to a clustering of offences in space and time.
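The notion of events in "spatial and temporal proximity" can be made concrete with a minimal Python sketch. It is not the PRECOBS procedure: the input format (projected coordinates in metres plus a date), the function name and the sample offences are assumptions made for illustration, and the default thresholds of 500 m and 7 days are borrowed from the alert parameters described later in the article.

```python
from datetime import date
from math import hypot


def near_repeat_pairs(events, max_dist_m=500.0, max_days=7):
    """Return (earlier, later) index pairs of events close in space and time.

    events: list of (x_metres, y_metres, date) tuples in a projected
    coordinate system (hypothetical input format, not a police data schema).
    """
    pairs = []
    for i in range(len(events)):
        xi, yi, di = events[i]
        for j in range(i + 1, len(events)):
            xj, yj, dj = events[j]
            close_in_space = hypot(xi - xj, yi - yj) <= max_dist_m
            close_in_time = abs((di - dj).days) <= max_days
            if close_in_space and close_in_time:
                # Order each pair chronologically: possible originator first.
                pairs.append((i, j) if di <= dj else (j, i))
    return pairs


# Made-up offences: two nearby burglaries three days apart, one far away.
burglaries = [
    (0.0, 0.0, date(2015, 11, 2)),
    (300.0, 100.0, date(2015, 11, 5)),
    (5000.0, 5000.0, date(2015, 11, 3)),
]
print(near_repeat_pairs(burglaries))
```

In this sketch only the first two offences form a near-repeat pair; the third is close in time but too far away in space.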

Fig. 3

a Example of near-repeat burglaries; 9 offences from 5 years. The originator event (dot) and near-repeat events (squares) happened within three days (real time and distance data, location spatially blurred). b Fictitious example of a near-repeat area (dashed line) and the corresponding fringe area (dotted line). c Fictitious example of an initial offence (dot) triggering an automated prediction and operational circle (solid line). Background maps by Stamen Design under CC BY 3.0

Although not every burglar acts rationally and not every residential burglary triggers subsequent events, PRECOBS uses the near-repeat phenomenon for crime prediction. The basic idea is to identify events which are likely to be followed by near-repeat burglaries. To prevent those potential follow-up events, subsequent target-oriented operational planning is applied. To configure the software for field operation, data from the past (usually five) years are analysed. The procedure examines which attributes of residential burglaries point towards near-repeat originators. For this purpose, the system primarily relies on the circumstances of an offence and the geographic location. So-called triggers, which indicate that near repeats are to be expected, as well as anti-triggers, which speak against near repeats, are identified and listed in a catalogue in the system. Triggers and anti-triggers cover the following attribute groups: stolen goods, modus operandi (method of entry, tools, etc.) and locality (type of house, area, etc.). In addition, areas with a high proportion of near-repeat burglaries in the past are identified. Finally, a retrospective simulation study verifies in which of the “near-repeat affine” areas promising predictions are possible (Schweer 2015). Performance is measured via the accuracy of predictions in the simulation. Promising areas, so-called near-repeat areas (Fig. 3, subgraph b), are activated in real-time operation. Each near-repeat area is surrounded by a fringe area (500 m distance) to include events in spatial proximity to the areas in real-time operation.

During everyday use, PRECOBS only needs a small amount [Footnote 7] of data, which stems from police investigations and is mainly recorded when a residential burglary is reported by the victims and the information is entered into the case processing system of the police (in Baden-Württemberg ComVor). Besides the attributes related to trigger criteria, the address [Footnote 8], date and time of the initial event are needed. The more precise the information is, the more precise the prediction will be. During the pilot, data were imported into PRECOBS three times a day via a direct interface to the ComVor database. After the import, attributes of recent burglary events are compared with the trigger catalogues. If the attributes match the trigger criteria and the burglary took place in a near-repeat area, an automated prediction is made. After a check for plausibility, the operators (the police officers operating the software) can accept or reject the prediction. When a prediction is accepted, an alert is relayed to the local police department, containing a PDF document with a map and additional information, e.g. about the initial incident and preferable times for patrol. The area which should be patrolled is called the operational circle and comprises a circular area around the originator (the burglary that triggered the alert) with a radius of 500 m (Fig. 3, subgraph c). For this area, a heightened risk of near-repeat burglaries is assumed for 7 days. Although close to the near-repeat area, the fringe area cannot produce automated alerts. After data import, however, the software provides the operator with an overview of burglaries in that sector, including information about matching trigger criteria. In this way, the operator can check whether a burglary might be a trigger for near repeats, and an alert can be created manually (operator alert). The PDF, the relay and what follows the alert are the same as for automated alerts.
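The decision logic described above can be summarized in a short Python sketch. It is a simplification under stated assumptions, not the PRECOBS implementation: the class and function names, the attribute labels and the rule for combining triggers with anti-triggers (at least one trigger and no anti-trigger) are hypothetical, since the text does not specify how the catalogue entries are weighed.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Burglary:
    x: float          # projected coordinates in metres (hypothetical format)
    y: float
    attributes: set   # made-up labels, e.g. {"window_forced", "jewellery"}
    zone: str         # "near_repeat", "fringe" or "outside"


def classify(burglary, triggers, anti_triggers):
    """Return 'automated', 'operator_review' or 'none' for one burglary.

    Assumed rule: at least one trigger attribute and no anti-trigger attribute.
    """
    has_trigger = bool(burglary.attributes & triggers)
    has_anti_trigger = bool(burglary.attributes & anti_triggers)
    if not has_trigger or has_anti_trigger:
        return "none"
    if burglary.zone == "near_repeat":
        return "automated"        # still has to be accepted by an operator
    if burglary.zone == "fringe":
        return "operator_review"  # only a manual operator alert is possible
    return "none"


def in_operational_circle(originator, x, y, radius_m=500.0):
    """Check whether a location lies within the circle around the originator."""
    return hypot(originator.x - x, originator.y - y) <= radius_m


triggers = {"window_forced"}          # hypothetical catalogue entries
anti_triggers = {"no_forced_entry"}
event = Burglary(0.0, 0.0, {"window_forced", "jewellery"}, "near_repeat")
print(classify(event, triggers, anti_triggers))
print(in_operational_circle(event, 300.0, 300.0))
```

Keeping the zone check separate from the attribute check mirrors the text: the same trigger match leads to an automated prediction inside a near-repeat area but only to an operator review in the fringe area.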

As patterns of burglary differ over seasons, PRECOBS has separate configurations for standard time and daylight saving time. Furthermore, the geographical distribution of near-repeat areas is not stable over time, which leads to a recalibration of the near-repeat configuration with each new configuration. A recent study by Hoppe and Gerell (2018) supports the variability of geographic aspects related to near-repeat victimization: it finds that some core patterns are stable across time but also reports substantial year-to-year fluctuations.

To sum up, PRECOBS is not designed to predict all burglaries but to predict potential burglaries following an initial event in spatial and temporal proximity. The way PRECOBS makes predictions is not based on a complicated algorithm and does not include elements of machine learning or artificial intelligence. The logic behind an alert is not a “black box” and is comprehensible for the police officers. This has been affirmed by the operators in semi-structured interviews. They also emphasized that they consider the software a useful additional tool that requires the expertise of experienced police officers. Although some additional data are included in the system (e.g. type of street, microm-Geo-Milieus, etc.) to enrich the individual decision-making of the operators, the amount of data processed for prediction is small and originates from everyday police work. The illustration above roughly presented the parts of data collection, data analysis and prediction in the “prediction-led policing business process” by Perry et al. (2013). Predictions are followed by police operations; these are presented in the next section.

4 Police Operations

During the 6-month evaluation period, 183 (PD Karlsruhe = 72, PD Stuttgart = 111) out of 206 predictions were accepted and relayed as PRECOBS alerts. In the PD Stuttgart, there were more operator alerts than automated alerts. In the PD Karlsruhe, it was the opposite. In general, most of the alerts affected urban areas. This was because in rural areas only a few of the offences fell into the relevant near-repeat or fringe areas (see Fig. 4). In the rural district of Calw (Landkreis), for example, only three out of sixty-nine events (4.3%) occurred in these areas. In contrast, in the urban district of Karlsruhe (Stadtkreis), 63.9% of 274 offences were committed in relevant areas. For this reason, rural areas could not benefit much from the applied near-repeat prediction method. This was also remarked on by the operators, who expressed scepticism about the use of PRECOBS in those areas.

Fig. 4

Percentage and number of burglaries depending on spatial location in areas relevant for PRECOBS alerts, 31.10.2015–30.04.2016. Urban areas: Stadtkreis Stuttgart, Stadtkreis Karlsruhe, Stadtkreis Pforzheim. Rural/mixed areas: Landkreis Karlsruhe, Landkreis Enzkreis, Landkreis Calw

With the implementation of predictive policing, what follows the process of crime forecasting is particularly important for achieving the desired effects. In other words: “Even the best analysis will do nothing to affect crime rates if it does not influence police practices” (Perry et al. 2013). In the context of the alerts during the pilot, it was specified that patrols in the operational circle were to be increased at certain hours during the seven days following the trigger event (including that day) and that various measures (e.g. foot patrol, identity checks, vehicle checks) were to be carried out to prevent and counter (near-repeat) burglaries. The instruments used were the same as those used independently of P4; the difference was that the spatial and temporal focus was based on predictions made with PRECOBS.

Here, a central aspect was that alerts had to be relayed quickly to the affected local police stations. During the pilot, this was ensured with only a few exceptions. The time span between data import and the relay of the alerts was usually short (median = 2 h). The operators reported that the quality of input data increased during the pilot, making fast predictions more feasible. A more challenging issue is that the time span between the trigger event and the relay of the alert might be affected by a longer absence of the victims and a subsequent delay in detecting and reporting the offence. On average, the time between the offence and the relay of the alert was ca. 30 h [Footnote 9]. Due to a skewed distribution and outliers, the mean value is biased and the median (20 h) is noticeably lower. In total, 61% of the alerts were relayed no later than 24 h after the trigger event. This can be considered reasonable because the duration of the alerts was seven days and near-repeat events, when they occurred, happened within 60 h on average (median 50 h) after the initial event.

The translation of the alerts into police operations was measured via two different methods. First, anonymized GPS data from police resources (mainly cars) were analysed to measure whether and how police density changes during alerts. Second, manually recorded information about operations during alerts was examined to give a summary of what kind of operations were carried out. Both measures were later used in the evaluation for correlation and regression analyses. The manually recorded data were entered directly into the case processing system by the patrol officers, and actions were assigned to the identification number of the corresponding alert. On average, 48 h of patrol were used with 2.8 patrol officers per documented action. Besides patrols in vehicles, foot patrols were carried out by uniformed or plain-clothed officers (Zivilbeamte). Spatially and temporally focused identity checks (mean 16.5) and vehicle inspections (mean 9.4) were carried out. Occasionally, the resident population was contacted; this also happens in areas not affected by PRECOBS alerts at the beginning of the dark winter months. A detailed analysis of these measures can be found in the evaluation report (in German, Gerstner 2017, 32–35).

The analysis of anonymized GPS data allows a comparison between states of active and inactive alerts. Because of the anonymization, a workaround had to be applied in preparing and analysing the data. As the timestamps of the GPS data were anonymized to time slots of 5 min and the signal was sent about every 100–200 m, it was not possible to calculate police density on the basis of a timescale. Instead, for each cell of a 350 × 350 m grid spanned across the pilot area, the police density was measured as a weighted frequency of sent signals for each day of the pilot. The weight reflects the number of sent signals during a 5-min time slot (a proxy for the speed of the car), and slow-moving or parked vehicles were assigned a larger weight. To measure the changes in police density during active alerts, the operational circles [Footnote 10] were intersected with the grid, and the increase was measured as the proportional difference between the density during the alert and the baseline density (mean for days without alerts). Figure 5 gives an example of how density changes during an alert. Subgraph −1 shows the density before the trigger event. In A1, a burglary happened and the alert was relayed on the following day. A2 to A7 represent the phase with the active alert, and the graphs reveal an increased density during that period and a decrease after the alert. For a more compact description and further analyses, the deviation from the basic rate (phases without alert) in the operational circle was calculated for each alert. As values can be strongly affected by outliers (e.g. situations including riot police [Footnote 11]), the mean value for the grid cells intersecting the operational circle was calculated as the arithmetic mean as well as with more robust measures. Looking at the 5% trimmed mean, it can be shown that in 6% of the alerts no change or a slight decrease in density was observable. The remaining 94% showed an increase during active alerts.
On average, the density was 73% (median 49%) higher compared to the basic rate (phases without an alert). In the police department of Stuttgart, the increase was slightly higher on average than in Karlsruhe (Fig. 6).
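The two summary steps described above, the proportional difference to the baseline and the outlier-robust trimmed mean, can be sketched in Python as follows. The sketch does not reproduce the signal weighting used in the evaluation, which is not specified in detail here; the function names and the per-cell values are made up, and the trimming rule (dropping a whole number of values from each tail) is a simplifying assumption that may differ from the statistical software used by the author.

```python
from statistics import mean


def proportional_increase(alert_density, baseline_density):
    """Percentage change of density during an alert relative to the baseline."""
    return (alert_density - baseline_density) * 100.0 / baseline_density


def trimmed_mean(values, percent=5):
    """Mean after dropping `percent` % of the values from each tail.

    Simplified rule (assumption): the number of dropped values per tail is
    rounded down to a whole number.
    """
    ordered = sorted(values)
    k = len(ordered) * percent // 100
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)


# Made-up percentage increases for 20 grid cells of one operational circle;
# the value 900.0 stands for an outlier such as a large police operation.
increases = [40.0] * 10 + [60.0] * 9 + [900.0]
print(mean(increases), trimmed_mean(increases))
print(proportional_increase(173.0, 100.0))
```

With these invented values the single outlier pulls the arithmetic mean well above the trimmed mean, which is the reason for reporting a robust measure alongside it.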

Fig. 5

Sources: VIADUX-location data, PRECOBS-database P4

Example of increase in police density during an active alert, own calculation.

Fig. 6

Sources: VIADUX-location data, PRECOBS-database P4

Distributions of police density increase in %, accepted and relayed alerts, own calculation.

5 Criminal Response

Once police operations based on predictions have been put into practice, the question arises whether these operations take effect. The relation between costs and benefits of data-driven policing, and especially predictive policing, has always been hard to illuminate, and up to now this “literal million dollar question remains unanswered” (Ferguson 2017). The evaluation of the P4 pilot included neither an analysis of financial aspects nor an experimental research design to determine the number of prevented burglaries and therefore cannot answer this question. Nevertheless, detailed analyses of crime data allow for relevant conclusions that shed some more light on the potential benefits of predictive policing strategies.

5.1 Number of Cases

A common practice used to give information about the effects of predictive policing—especially in media reports—is comparing the number of cases during test periods and the year before. However, this method can result in wrong assessments for different reasons. First, burglary rates have a natural fluctuation and are influenced by multiple factors. This becomes more important with smaller units. Second, predictive policing is often a response to increasing crime rates which are heading towards a peak. This peak might be followed by a decrease, which isn’t connected to police operations. Vlahos (2012) illustrates this with numbers reported in the context of Blue CRUSH in Memphis and shows that different results are to be expected with different baseline numbers (e.g. a mean value of 5 years instead of one single year). Nonetheless, comparing the number of burglaries with the years before gives a first impression of how cases developed during the trial period and the years before (Fig. 7). In the PD Stuttgart, the cases dropped clearly during the trial period (15/16). Whether this was related to PRECOBS is not clear, as this was the same in the preceding period (14/15). In the PD of Karlsruhe (total area), the numbers remained stable compared to the preceding period. This was mainly caused by a high burglary activity during November and December 2015 in the city of Karlsruhe (Stadtkreis) and the rural area Karlsruhe (Landkreis). It also remains unclear whether the numbers could have been higher without PRECOBS. In the remaining parts of the PD Karlsruhe, the numbers decreased. Despite some positive indications, the case numbers cannot be used as a definite estimation about the effectiveness of predictive policing. An appropriate method to measure the proof of benefit would be to implement a randomized controlled research design. This has been widely applied in the context of hot-spot policing (see, e.g. the meta-analysis by Braga et al. 2014). 
In the context of predictive policing, such studies are hard to conduct, not least because researchers have to deal with spatially and temporally varying locations of crime instead of fixed hotspots. Only a few studies have implemented an experimental (Mohler et al. 2015; Hunt et al. 2014) or quasi-experimental design (Mastrobuoni 2014).

Fig. 7

Data source: ComVor-database LKA BW, own calculation

Number of residential burglaries in subdivisions of the pilot area. Evaluation period of pilot (31 October 2015–30 April 2016) compared to equivalent period in the years before.

5.2 Near-Repeat Patterns

As mentioned above, predictive policing with PRECOBS is designed to prevent near-repeat events or to disturb local series of burglary. This leads to the hypothesis that near-repeat patterns should be less prominent in the evaluation period compared to the years before. To test this hypothesis, police data were analysed with the “Near Repeat Calculator” software (Ratcliffe 2008a, b). The software uses Monte Carlo simulation to test whether observed near-repeat patterns deviate significantly from random patterns. The following analysis only refers to near repeats with a spatial bandwidth of 500 m and a temporal bandwidth of 7 days (in accordance with the PRECOBS alerts). Table 1 presents findings for the police department of Stuttgart (which corresponds to the “Stadtkreis” Stuttgart), the complete police department of Karlsruhe and its subregions Karlsruhe (“Stadtkreis”) and Pforzheim (“Stadtkreis”). In each region, near-repeat patterns were analysed for the total area (all burglaries) and for the subregions of the near-repeat areas (burglaries within near-repeat areas). The results from the pilot period (W15–16, November 2015 until March 2016) are compared to the equivalent time span in the years before. The numbers in the cells denote how much greater the chance of another burglary is than it would be if there were no discernible near-repeat pattern (Ratcliffe 2008b, p. 8); statistically significant values are marked with asterisks. This approach not only allows an independent test of whether an impact of predictive policing is indicated, but also a test of whether the near-repeat areas defined in PRECOBS are justified.
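The general idea of such a Monte Carlo test can be illustrated with a Knox-type permutation sketch in Python. It is not a reimplementation of the Near Repeat Calculator: the function names, the single 500 m / 7 day band, the number of simulations and the sample data are assumptions chosen for illustration. The observed number of event pairs that are close in both space and time is compared with the numbers obtained after randomly reassigning the event dates to the locations.

```python
import random
from math import hypot


def count_near_pairs(coords, days, max_dist_m=500.0, max_days=7):
    """Count event pairs within the spatial and the temporal bandwidth."""
    n = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if dist <= max_dist_m and abs(days[i] - days[j]) <= max_days:
                n += 1
    return n


def knox_monte_carlo(coords, days, n_sim=999, seed=0):
    """Return (observed count, observed/expected ratio, pseudo p-value)."""
    rng = random.Random(seed)
    observed = count_near_pairs(coords, days)
    shuffled = list(days)
    simulated = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # break the link between place and time
        simulated.append(count_near_pairs(coords, shuffled))
    expected = sum(simulated) / n_sim
    ratio = observed / expected if expected > 0 else float("inf")
    as_extreme = sum(1 for s in simulated if s >= observed)
    p_value = (as_extreme + 1) / (n_sim + 1)
    return observed, ratio, p_value


# Made-up data: three small series, each clustered in space and time.
coords = [(0, 0), (100, 0), (200, 0),
          (5000, 0), (5100, 0), (5200, 0),
          (0, 5000), (100, 5000), (200, 5000)]
days = [1, 2, 3, 40, 41, 42, 80, 81, 82]
observed, ratio, p_value = knox_monte_carlo(coords, days)
print(observed, round(ratio, 2), round(p_value, 3))
```

A ratio above 1 means more near-repeat pairs than expected under random timing, which corresponds to the interpretation of the cell values given above.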

Table 1 Results of near-repeat analyses, overall areas and near-repeat areas

In both of the police departments (SK Stuttgart and PP Karlsruhe), significant near-repeat patterns were found for each time period including the pilot. Turning to the near-repeat areas (the areas which benefit most from PRECOBS), it appears that in these areas the values are noticeably higher compared to the total area. This corresponds to the expectations and indicates that the areas defined by the developer of the software are justified. More relevant is the fact that no significant near-repeat pattern can be found in the near-repeat areas for the evaluation period. This has been observed in previous years. This also applies to the city of Pforzheim (SK Pforzheim) if considered separately. In this case there was no significant pattern in the period W14–15, but for the evaluation period the coefficient is below zero, which means that there were fewer near-repeat events than in a random distribution. Analyses of near-repeat patterns have to be interpreted carefully because the existence of near repeats depends to some degree on the total number of events, and with decreasing numbers a near repeat becomes less probable. In this context, it is remarkable that in the city of Karlsruhe (Stadtkreis) the near-repeat ratio within near-repeat areas is smaller than in the year before (and also not significant), even though the number of burglaries increased by more than 100 cases (+58%, see Fig. 7) in the trial period.

The results described above suggest that inside the near-repeat areas certain mechanisms might have had an impact on near-repeat burglaries which probably were prevented due to PRECOBS alerts. Nevertheless, as no randomized controlled trial was applied, causality remains unclear, but there are some correlations between the intensity of measures taken during an alert and the number of near-repeat events following the trigger event. Some examples are shown in the following section.

5.3 Near Repeats in the Context of PRECOBS Alerts

The above-described measurement of police interventions via GPS data gives an impression of how alerts differ from each other in intensity. This leads to the hypothesis that trigger events followed by alerts with high intensities of police presence/operations will experience fewer near-repeat events. The dependent variable in the following analyses is the number of burglaries subsequent to an alert-triggering offence in the operational circle during the active alert (7 days including the trigger event). To account for “near misses”, the radius of the circle was set to 600 m for these analyses. The evaluation included several approaches to test this assumption, and two examples are shown. Table 2 reports bivariate correlations between the number of near-repeat offences during an alert and three independent variables. At least one near repeat was observed in 25% of the cases. The time span between the trigger event and relay of the alert does not correlate with the number of near repeats, which might be explained by the fact that alerts were distributed quickly (see above). For the increase in police density measured via the GPS data, a significant negative bivariate correlation was found for the police department of Karlsruhe; here a stronger increase is associated with fewer near repeats. Another important finding was that potential preceding events, which were possibly related to the originator event due to spatial and temporal proximity, correlated negatively with the number of near repeats. The correlation was significant for the total area and significant at the 10% level in the PD Karlsruhe. In multivariate regression analyses, no significant effects were found, except for the preceding events in the total area of both PDs. A more detailed description of the results and an analysis with a wider definition of the dependent variable can be found in Gerstner (2017).

Table 2 Correlation between near-repeat events within alerts and selected variables.

In the police department of Stuttgart, there were no correlations found between the number of near repeats and the police density measured via GPS data. But in this area a correlation between manually recorded police activityFootnote 12 and the number of near repeats was found. Here, an index (PCA factor score, KMO = 0.65/eigenvalue = 2.40/explained variance = 0.51) including the variables “sum of operating hours”, “number of identity checks”, “number of vehicle controls”, and “number of direct contacts to residents” was created, which gives a summary of how alerts differ in intensity of patrol activity. Although the factor score cannot be interpreted in terms of a stated number, the intensity between the alerts can be compared. With a higher intensity (scoring high on most of the variables included in the factor score), less near repeats were to be expected (Spearman’s rho = − 0.21, p < 0.05, N = 100). This was also found in a multivariate framework where the dependent variable was the number of near repeats following an alert (negative binomial regression, Table 3). Significant effects can be reported for the “intensity” (b = − 0.46, p < 0.01) and the “number of potentially preceding events” (b = − 0.89, p < 0.1). The remaining predictors, “time between the originator and the relay of the alert” and the “ratio of patrol officers in plain clothes” have no effects on the outcome variable. With a change of one standard deviation in intensity, the estimated number of predicted near repeats changes by − 0.18 on average (AME,Footnote 13 p < 0.01). With a change in “intensity” from the 10th percentile to the 90th percentile of the distribution, the amount of model-based estimated near repeats changes by − 0.78 burglaries. Within the middle 50%, the number of predicted events changes only by − 0.23 (25th percentile = − 0.69, 75th percentile = 0.63). So this effect seems to be rather small. 
As nonlinear regression is affected by inherently multiplicative (or conditional) effects (for further explanation, see Gerstner and Oberwittler 2018; Oberwittler and Gerstner 2014), moderating effects between “intensity” and the “number of potentially preceding events” are observable (Fig. 8). If there are no preceding events, the effect of the intensity of police activity appears to be important (dots). With one preceding event the effect becomes less important (diamonds), and with more than one preceding event (triangles and squares) the effect is practically absent. The results are based on a small sample size and only a short period of time and therefore have to be treated with caution. Nonetheless, these findings suggest that detecting small series of burglaries at an early stage can improve crime prevention. Future research on applied predictive policing should pursue this issue.
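The conditional nature of these effects follows from the log link of a count model: the expected count is the exponential of the linear predictor, so a fixed change in one predictor produces a constant relative change but an absolute change that depends on the other predictors. The following minimal sketch illustrates this with the two significant coefficients reported above. The intercept is not reported in the text, so the value used here is purely hypothetical; the remaining predictors are omitted, and the function names are illustrative only. The sketch does not reproduce the values shown in Fig. 8.

```python
import math

# Coefficients reported in the text (negative binomial regression, Table 3).
B_INTENSITY = -0.46
B_PRECEDING = -0.89
# HYPOTHETICAL intercept: not reported in the text, chosen only for illustration.
B0 = -0.5


def expected_near_repeats(intensity, preceding, b0=B0):
    """Expected count under a log link: mu = exp(b0 + b1 * intensity + b2 * preceding)."""
    return math.exp(b0 + B_INTENSITY * intensity + B_PRECEDING * preceding)


def intensity_effect(preceding, low=-0.69, high=0.63):
    """Absolute change in the expected count when intensity moves from `low` to `high`.

    The defaults are the 25th and 75th percentiles of the intensity index given in the text.
    """
    return expected_near_repeats(high, preceding) - expected_near_repeats(low, preceding)


# The absolute effect shrinks as the number of preceding events grows,
# although the relative effect (the ratio of expected counts) stays the same.
for n_preceding in range(4):
    print(n_preceding, round(intensity_effect(n_preceding), 3))
```

Under these assumptions the absolute effect of intensity is largest when there are no preceding events and approaches zero as their number increases, which corresponds to the pattern described for Fig. 8.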

Table 3 Coefficients and average marginal effects from multivariate model (negative binomial regression), dependent variable: near repeats in operational circle (600 m/7 days).
Fig. 8

Data source: ComVor-database LKA BW, PRECOBS-database, own calculation

Predicted values for different values for “intensity of police activity”, conditioned by “number of preceding events”.
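The dependent variable of the model counts near repeats within the operational circle (600 m/7 days). As a rough illustration of such a count, the sketch below tallies burglaries that fall within a given radius and time window after an originator event. It assumes planar coordinates in metres and a window that starts at the time of the originator; both are simplifying assumptions, the function name and the sample events are invented, and the sketch is not the procedure implemented in PRECOBS.

```python
from datetime import datetime, timedelta
from math import hypot


def count_near_repeats(originator, burglaries, radius_m=600.0, days=7):
    """Count burglaries within `radius_m` metres and `days` days after the originator.

    Each event is a tuple (x_m, y_m, timestamp) with planar coordinates in metres
    (an assumption made for this sketch).
    """
    ox, oy, ot = originator
    window_end = ot + timedelta(days=days)
    return sum(
        1
        for x, y, t in burglaries
        if ot < t <= window_end and hypot(x - ox, y - oy) <= radius_m
    )


# Invented example events, for illustration only.
originator = (0.0, 0.0, datetime(2016, 1, 10, 12, 0))
events = [
    (300.0, 400.0, datetime(2016, 1, 12, 9, 0)),  # 500 m away, 2 days later
    (500.0, 500.0, datetime(2016, 1, 13, 0, 0)),  # about 707 m away: outside the circle
    (100.0, 0.0, datetime(2016, 1, 20, 0, 0)),    # 10 days later: outside the window
    (0.0, 200.0, datetime(2016, 1, 9, 0, 0)),     # before the originator
]
n_near_repeats = count_near_repeats(originator, events)
```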

5.4 Assessment by Police Officers

In the “prediction-led policing business process”, police operations ideally alter the environment (see Fig. 1) by decreasing crime. It also has to be taken into account that predictive policing strategies might alter the working environment of police officers. Therefore, the evaluation study included semi-structured interviews with the PRECOBS-operators and an online survey with more than 700 police officers to assess how predictive policing with PRECOBS influenced daily police work. The following section reports on some important aspects; detailed information can be found in Gerstner (2017).

5.4.1 Operators

PRECOBS-operators are officers who operate the software, evaluate automated predictions, manually generate predictions and relay the alerts. Before the pilot they already worked as analysts in the field of residential burglary (footnote 14) and are therefore trained to monitor current crime. Their experiences with the predictive policing software and their assessments during the pilot were captured in six expert interviews (two in Karlsruhe, four in Stuttgart) following a semi-structured interview guide. Overall, the operators showed a high degree of consensus, and their basic opinion was that PRECOBS is a useful supplement, especially during phases with a high load of burglaries. Accordingly, the application in rural areas (police department of Karlsruhe) and operation over the weekends (both departments), with only a few burglaries, was viewed sceptically. The software was unanimously rated as user-friendly, even though there were some initial difficulties. The support offered by the developer was assessed as good. Asked about the transparency of the automated alerts, the operators emphasized that in most cases these were comprehensible, and in their opinion this is important because the software needs an experienced officer to work properly. Finally, the operators appreciated the additional tools (which were not part of the evaluation study) implemented in PRECOBS for the analysis of local crime activities.

5.4.2 Online Survey

An online survey with 730 police officers made it possible to capture how predictive policing with PRECOBS was perceived by patrol officers and officers from the middle and upper management level. The survey was based on a convenience sample: the invitation to participate was sent via e-mail to officers who, due to their work, (possibly) came into contact with PRECOBS alerts. It was carried out in May 2016 after the evaluation period. The survey asked about information and knowledge prior to the start of the pilot, practical experiences and personal assessment of the pilot’s outcome. Not all of the participants had been in contact with PRECOBS alerts (strategic or operational). This caused a large number of missing values and “don’t knows” in some questions, and the data cannot be seen as statistically representative. However, the survey provides information about several interesting aspects.

One important finding was that at the end of the pilot predictive policing was a divisive issue. Figure 9 shows the distribution of an index measuring the perceived value of predictive policing. The index is a Likert scale consisting of seven items asking about the efficacy of predictive policing in the context of P4 (Cronbach’s Alpha = 0.912; for the item wordings, see the caption of Fig. 9). The total distribution shows two peaks on the left and right sides of the midpoint. In total, it appears that about one half of the respondents see a promising concept in the applied form of predictive policing, while the other half hold the opposite view. Split by police department, the distributions differ slightly. In Stuttgart the polarized opinions become more obvious, whereas in Karlsruhe a larger proportion of the respondents have a more positive perception. This remains valid when asking about continuation. In the police department of Karlsruhe, 62% of the respondents agreed to a further use of the software. In the police department of Stuttgart, only 41% agreed to this. Besides the regional differences, there were differences by occupational status of the respondents. The highest agreement was found in the group of the higher management level (65%, höhere Führungsebene), followed by the respondents of the middle management level (57%, mittlere Führungsebene). The lowest agreement was found in the group consisting mainly of patrol officers (46%, Sachbearbeiter).

Fig. 9

Data source: P4 online survey

Perceived value of predictive policing. Mean score out of 7 items (Cronbach’s Alpha = 0.912): “Predictive policing (PP) is a useful addition to regular police work”, “PP is more annoying than useful”*, “PP is a suitable auxiliary tool for targeted planning”, “An added value is not given with PP”*, “In my opinion PP remains gazing into a crystal ball”*, “It is worth thinking about using PP with other offences”, “The financial resources of the pilot could have been used better elsewhere”* (n = 552, *reverse coded, PP Stuttgart/Karlsruhe = police departments).
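The construction of such an index can be sketched as follows: the items marked with an asterisk are reverse coded, each respondent's score is the mean of the seven recoded items, and Cronbach's alpha summarizes the internal consistency of the scale. The response range of the items is not stated in the text, so the range 1 to 5 used here is an assumption, and the function names are illustrative; this is a minimal sketch and not the computation used in the study.

```python
from statistics import mean, pvariance

# ASSUMED response range; the text does not state the scale of the items.
SCALE_MIN, SCALE_MAX = 1, 5
# Zero-based positions of the items marked * (reverse coded) in the caption.
REVERSED = {1, 3, 4, 6}


def recode(responses):
    """Flip reverse-coded items so that higher values always mean higher perceived value."""
    return [
        (SCALE_MIN + SCALE_MAX - r) if i in REVERSED else r
        for i, r in enumerate(responses)
    ]


def index_score(responses):
    """Mean of the recoded items for one respondent."""
    return mean(recode(responses))


def cronbach_alpha(rows):
    """Cronbach's alpha: k / (k - 1) * (1 - sum of item variances / variance of total score)."""
    recoded = [recode(row) for row in rows]
    k = len(recoded[0])
    item_variances = [pvariance([row[i] for row in recoded]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in recoded])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

With invented answers such as `[5, 1, 5, 1, 1, 5, 1]`, a respondent who fully agrees with the positive items and fully disagrees with the negative ones receives the maximum index score under the assumed range.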

The differences in the perceived value and the agreement to a continuation of predictive policing with PRECOBS might not only be explained by the grouping above. Another interesting finding was that officers who were confronted with lots of alerts during their everyday work disagreed more often with a continuation (Fig. 10).

Fig. 10

Data source: P4 online survey

“The software should be used after the pilot” in relation to the frequency with which patrol officers had to serve PRECOBS alerts (n = 430).

This is possibly caused by the fact that other work had to be neglected during active alerts (Fig. 11), which was often the case in the PD Stuttgart. Another explanation might be that success is not immediately measurable. Predictive policing with PRECOBS aims at preventing burglaries by deterrence rather than catching criminals. It is therefore hardly surprising that some respondents report negative perceptions when frequent PRECOBS alerts break established routines.

Fig. 11

Were there ever staff shortages because of alerts during the pilot? (n = 602)

These considerations are also reflected in the results reported in Table 4. Residential burglary can vary from year to year, especially when looking at small spatial units. Asked about the annual variation in the area where they work, the respondents showed lower agreement with the benefits of predictive policing (column A) when they reported strong variations. These correlations can be found in both PDs and indicate that it is hard to assess the value of predictive policing without any point of reference. Conversely, if respondents experienced that the prevention of residential burglaries in the trial period required fewer hours of operation than the year before, their perceived value of predictive policing and their agreement with the benefits of this measure were higher (column B). This result was only found in the PD Stuttgart and has to be handled carefully. On the one hand, the decreasing number of burglaries in Stuttgart might actually require fewer hours for burglaries. On the other hand, only 34% of the respondents in Stuttgart agreed with this item, and the correlation also means that those who did not experience a lower workload with burglaries had a more negative perception of predictive policing. In addition, many of the respondents were not able to give an estimate for that item. Nonetheless, these results point to the importance of assessing processes which possibly generate friction among actors in an implementation process.

Table 4 Bivariate correlations, perceptions of trends related to residential burglary and perceived value of predictive policing

6 Summary

After the evaluation of the P4 pilot project, it is still hard to assess whether or to what extent applied predictive policing can contribute to the reduction in residential burglaries. Nonetheless, the study offers relevant insights into different aspects of a “prediction-led policing business process” (Perry et al. 2013) put into practice. The applied software (PRECOBS) focuses on “near-repeat prediction”. This means that only residential burglaries potentially following an initial event are predicted. For the calibration of the software as well as for predictions, only a small amount of data is needed, which primarily stems from police records. Therefore, data collection is mostly part of regular police work. The method of prediction was comprehensible for the police officers who operated the software, and they considered the software a useful additional tool. The integration of the software into everyday business worked without much difficulty, and police actions following an alert-triggering burglary took place in a timely manner. The relay of PRECOBS alerts resulted in an increase in patrol activity, which was measured via anonymized GPS data from police cars. Besides patrolling in vehicles, the police operations included foot patrol, spatially and temporally focused identity checks and vehicle inspections, as well as occasional contact with the resident population. The main goal of the applied strategy was to prevent burglaries by deterrence rather than to catch criminals.

Regarding the crime-reducing effects of police operations following predictions of near-repeat burglaries, it must be noted that the results of the study have to be treated with caution due to the short evaluation period, the small size of the trial areas and, in particular, the lack of an experimental research design. Therefore, a comparison of case numbers with previous periods or other regions can give only limited insights. For example, although the police department of Stuttgart experienced a decrease in burglaries during the trial period, this cannot be attributed solely to the new strategy, as the decrease was also present in the year before. Nonetheless, the analyses of near-repeat patterns showed that certain mechanisms took effect, as no such patterns were found during the trial period in those areas expected to profit most from the software. This was also found in the urban area of Karlsruhe, where the total number of burglaries increased during the trial period. Multivariate regression analyses were used to examine how different intensities of police operations during alerts can affect the number of near-repeat events. Indications were found that higher intensities decreased the probability of near-repeat events. The analyses also indicated that the number of predecessor burglaries which were possibly related to the alert-triggering event due to spatial and temporal proximity had a negative effect: with more events preceding the originator, the probability of near-repeat events decreases. This points to the importance of techniques which can identify a series of residential burglary incidents. Despite these positive findings, it has to be noted that the size of the effects appeared to be moderate and the results have to be treated with caution in regard to their robustness.
Nonetheless, the findings of the evaluation study provide starting points for subsequent research in the field of place-based predictive policing, which is recommended.

Regarding the integration into the working environment of police officers, the small group of officers who operated the software assessed it as a useful additional instrument—especially during times with a high load of residential burglary. In a larger group, including patrol officers, the perceived value of predictive policing with PRECOBS was a divisive issue. In particular, officers who were confronted with many alerts tended to disapprove of a continuation of predictive policing. This might be due to perceived additional workload but also due to the fact that the preventive effects of deterrence cannot directly be perceived, especially with varying amounts of burglaries in local areas. The acceptance and assessments of predictive policing within police forces also provide additional links for future research.

Since August 2017, the software has been used within a second trial period in the same pilot area in Baden-Württemberg, in order to gain a higher degree of certainty about the benefits of predictive policing in the context of residential burglary. The temporally extended trial period includes a further scientific evaluation by the Max Planck Institute for Foreign and International Criminal Law in Freiburg. In addition to the extended time span, an experimental research design is applied to increase the informative value of the results.