The Evidence-Based Investigative Tool (EBIT): a Legitimacy-Conscious Statistical Triage Process for High-Volume Crimes

Based on the evidence at the end of a preliminary investigation of minor, non-domestic assault and public order cases, how accurately can the likelihood of a sanctioned detection be predicted for triage decisions while maintaining high awareness of legitimacy issues? The data were investigative records on assault and public order offences recorded by Kent Police, comprising a case-control sample of 482 randomly selected detected cases and 522 randomly selected undetected cases, a test sample of 931 cases, and an additional 7947 cases for testing the model on all eligible cases in the force area for the initial six months of its use. A case-control comparison between solved and unsolved cases produced a logistic regression model that was used to predict investigative outcomes in both the test sample and the complete tracking of its use in investigative operations. Eight elements of evidence available by the end of the preliminary investigation were found to predict whether a sanctioned detection would result from further investigation: (1) victim supports police prosecution, and evidence includes (2) a named suspect, (3) a cooperative witness, (4) CCTV evidence, (5) confirming police testimony, (6) forensic evidence, (7) a connection to other cases and (8) a report of the crime to police less than 28 days after the incident occurred. When the EBIT was calibrated to identify only the 31% of cases most likely to yield a detection from further investigation, the model correctly forecast 97% of cases that would not be solved, producing only 3% false negatives. It also reduced the false-positive rate from 73% to 22% in cases that did not lead to a sanctioned detection. A case-control analysis of solvability factors at the end of a preliminary investigation can identify almost all of the cases that are likely to be solved, even while the model predicts that the majority of all cases at that point will not be solved.


Introduction
The Evidence-Based Investigative Tool (EBIT) is a multi-stage early case review system deployed by Kent Police (UK) to assist with the allocation of additional investigative resources on minor, non-domestic assault and public order offences. By predicting investigative success with EBIT, police can free up more time for preventing serious crime or for investigating more solvable cases by reducing wasted resources applied to cases at the point they have clearly been established to be unsolvable (Sherman 2018). The tool aids decision-making by recommending an allocation decision based on an evidence-based actuarial solvability assessment produced by a logistic regression model, after which a two-step review of the case applies a structured professional judgement and public interest assessment to such issues as victim vulnerabilities and offender propensities to reoffend. This process was designed in Kent Police and is shown in Fig. 1. Each crime type has its own bespoke statistical model constructed to maximize accuracy. This article focuses on the model built to predict investigative outcomes for reported crimes of minor assault and public order (EBIT APO).
Minor non-domestic assault and public order were chosen as the initial group of crimes for building an EBIT for a number of reasons, including their low solvability and their high impact on police resource allocation. The aim of EBIT is to improve police decision-making after providing an initial investigation for all volume crime. The greatest potential value of the tool is for crimes that occur in the highest volume. The need for decision support with an EBIT has therefore grown substantially as minor assault and public order offences have increased.
Assault and public order offences have experienced a substantial increase in levels recorded by police, having doubled over the past 5 years across England and Wales (Fig. 2). This substantial uptick in volume, due primarily to changes in crime recording rules, has not altered absolute numbers of sanctioned detections (McFadzien and Phillips 2019). And because those absolute numbers, i.e. the number of cases receiving a charge or a caution, have barely changed, the proportion of cases proving to be solvable has dropped markedly. Complicating matters further is the disproportionate investment of investigative resource required to untangle the circumstances of these cases. The direct contact between complainants and suspects in these cases means that a large proportion of them will require an arrest and interviews in order to lead to detections (Burrows and Tarling 1987; Smit et al. 2004; Thanassoulis 1995).
A final reason this category was chosen is that these crimes, while numerous, comprise a very low proportion of total crime harm in any community (Sherman et al. 2016). Even if higher detection rates succeeded in reducing such low-harm events, this would produce very little harm reduction in communities experiencing high levels of knife crime and serious gang violence. Thus, the benefit to the community is disproportionately low in relation to the large investment required to take each case beyond its initial investigation period.
All of these reasons taken together mean that when police use resources disproportionately to harm while achieving few sanctioned detections, it is in the public interest to review the resource allocation policy. That is what Kent Police decided to do in 2017. The alternative policy Kent Police selected was to develop and test a solvability framework to target those cases most likely to get a positive outcome, as a supplement to other considerations for continuing cases beyond initial investigations.
This decision was made, in part, because a positive judicial outcome is not the only issue police should consider when choosing the best course of action about an alleged offence. Other features of the incident alleged to have been criminal are also important. Safeguarding vulnerable victims and targeting prolific offenders, for example, are core policing functions that need to be considered. EBIT addresses these by applying a structured professional judgement assessment to those cases that are not automatically allocated for further investigation. EBIT users are required by policy to establish the circumstances of the case for features such as the presence of a position of trust between the parties, victims who have suffered prior victimisations and suspects who are known to police due to prior offending. If any such features are identified, the case is sent to a supervisor for a professional judgement assessment. This assessment then takes into account both the initial EBIT solvability score and the features of public interest that have subsequently been identified in low-score cases. Only after reviewing both the solvability score and the qualitative dimensions of each case do the investigative managers determine the best course of action beyond the initial investigation.
The options for such actions range from a police investigation, through a referral or signposting to appropriate alternative services, to case closure. The core contribution of EBIT is to provide a reliable and objective means to focus initially on case solvability. Where there are identifiable characteristics of a case that are known early in the investigation and that correlate with a likely sanctioned detection, the EBIT result supports a decision to continue the investigation. It is only when the characteristics predict further police time to be wasteful that EBIT is used to trigger a police professional's review of that arithmetic assessment. The decision to produce an evidence-based statistical assessment of the likelihood of success is not a replacement of humans with a robot; it is rather a tool for advising experienced police professionals what has been found useful in prior cases and in prior studies.
Well before the Kent EBIT was developed, there was a modest but promising amount of research into solvability factors for violent offending. Most of the attention in solvability research focused on more harmful violent offences and acquisitive crimes (Olphin and Coupe 2019). While early studies had some limited success (see, e.g. Greenberg et al. 1977), there have also been a number of more recent studies that successfully identified solvability factors that are relevant in the context of EBIT (Peterson et al. 2010; Peterson et al. 2013; Roberts 2008). The EBIT strategy builds on these studies in the context of UK policing culture, in which substantial value is placed on fair procedures across all complainants and the legitimacy of making decisions that benefit communities while being fair to individuals.

Research Questions
The specific research question for developing EBIT in mid-2017 was this: Based on the evidence at the end of a preliminary investigation of minor, non-domestic assault and public order cases, how accurately can the likelihood of a sanctioned detection be predicted for triage decisions, while maintaining high awareness of legitimacy issues?
Once this question was answered, and a decision to implement EBIT was made, the further research questions for this article were these: Once the EBIT was applied generally in investigative operations for the categories of cases identified, what was the ongoing accuracy of the model? How much wastage of police resource did the EBIT model reduce?

Phase 1: Building the EBIT Model
A retrospective case-control design was used to identify solvability factors for the actuarial assessment component of EBIT. This design has many benefits for criminological research when the frequencies of binary outcomes are unequal (Loftin and McDowall 1988), as is the case for detections, which comprise a small percentage of all investigative outcomes. The data for developing the EBIT predictive model from a "training" sample consisted of 1004 cases drawn from the more than 20,000 eligible assault and public order offences that occurred in 2016 in Kent Police, a large county force in the UK. To be eligible, the case was not to have been recorded as a hate crime, domestic abuse or an assault case likely to be charged above the common assault charging standard. The random sample was chosen from the following offences:
The dependent variable was case detection. A random sample of cases was selected from the 2804 detected cases that were reported in the 2016 calendar year, and the sample was stratified across the 12 months to account for any seasonal variability. All of these cases met the United Kingdom's evidential charging standard for these outcomes: "charge", adult and youth cautions, community resolutions and penalty notices for disorder (PNDs). The undetected cases were a similarly stratified random sample from the 21,210 cases in Kent in 2016 with outcomes that did not meet the charging threshold after an investigation and were filed with no further police action. The final sample consisted of 482 detected cases and 522 undetected cases.
The independent variables were informed by prior research (such as Eck 1979; Greenwood 1970; Isaacs 1967; Olphin and Mueller-Johnson 2019) and the availability of those variables in police records. They fell across three categories.
Case variables, including outcome, CCTV evidence, forensic evidence, multiple offences, cooperative witnesses, report delay, skeleton reports (cases that have virtually no information, often reported by third parties), police evidence, presence of a weapon, reporting party and location of offence
Victim variables, including victim support for prosecution, alcohol or drug consumption, prior offending, age and gender
Suspect variables, including a named suspect, suspect's alcohol or drug consumption, prior offending, age and gender
Each case was screened for eligibility. Cases with a charging level above common assault or where the victim was a police officer were excluded from the analysis. Ultimately, 97 cases were excluded with replacement: 81 of these cases were removed due to being assault with injury cases deemed chargeable above common assault standard, and 16 cases were removed due to having police victims.
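The sampling and screening procedure described above can be sketched in code. This is a minimal illustration only: the field names (`month`, `police_victim`), the helper `stratified_sample` and the equal quota per month are assumptions, since the article states only that the samples were stratified across the 12 months and that excluded cases were replaced.

```python
import random
from collections import defaultdict


def stratified_sample(cases, n_total, is_eligible, seed=0):
    """Draw up to n_total eligible cases, spread evenly across months.

    cases: list of dicts with a 'month' key (hypothetical field name).
    is_eligible: function returning True for cases that pass screening.
    Ineligible draws are skipped and the next randomly ordered case from
    the same month is taken instead (exclusion with replacement).
    """
    rng = random.Random(seed)
    by_month = defaultdict(list)
    for case in cases:
        by_month[case["month"]].append(case)

    months = sorted(by_month)
    base, extra = divmod(n_total, len(months))  # assumed: equal monthly quotas
    sample = []
    for i, month in enumerate(months):
        quota = base + (1 if i < extra else 0)
        pool = by_month[month][:]
        rng.shuffle(pool)
        eligible = [c for c in pool if is_eligible(c)]
        sample.extend(eligible[:quota])
    return sample
```

The same routine would be run once on the pool of detected cases and once on the pool of undetected cases to build the two arms of the case-control sample.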
The independent variables were then either extracted by manual review by experienced detectives or data mined from the police intelligence system. Police actions and information gleaned after the point of allocation to divisional officers were discarded or treated in the negative to mimic the real-world application of the tool. In other words, the procedure limited the search for correlates to those that fit the chronological sequence of what was known at the end of a preliminary investigation, to the exclusion of what was discovered in a subsequent investigation.
Once the dataset was compiled, analysis of the cases involved the construction of a logistic regression model in R (R Core Team 2017) using the caret package (Kuhn 2008) to identify useful solvability factors. This involved constructing a model in which all variables were initially included, then refining it by removing variables that were not statistically significant and factors that would reduce the legitimacy of the tool.
The public interest questions were developed by experienced Kent Police officers. The questions included the following dimensions:
The presence of victim warning markers indicating involvement in prior crime or victimization
The victim being a repeat victim to the same suspect (ever)
The victim being a repeat victim of any crime in the past 12 months
The victim being an 'enhanced victim' under the United Kingdom's victim code
A position of trust between victim and suspect
The presence of a named suspect also triggered a number of public interest questions:
Does the suspect have a criminal history?
Is the suspect currently on bail or prison licence?
The answers to all these questions were scored using professional judgement. Answering some of these questions in the affirmative requires an immediate review, while others require two positive answers to trigger a review.
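The review-trigger rule described above can be sketched as follows. The article does not say which questions trigger an immediate review and which require two positive answers, so the split below, the question keys and the function name are all hypothetical.

```python
# ASSUMPTION: this grouping is illustrative only; the article does not
# specify which questions belong to which group.
IMMEDIATE_REVIEW = {"position_of_trust", "enhanced_victim"}
TWO_POSITIVES_NEEDED = {
    "victim_warning_marker",
    "repeat_victim_same_suspect",
    "repeat_victim_12_months",
    "suspect_criminal_history",
    "suspect_on_bail_or_licence",
}


def needs_supervisor_review(answers):
    """answers: dict mapping a question key to True/False."""
    # Any single positive answer in the first group triggers a review.
    if any(answers.get(q, False) for q in IMMEDIATE_REVIEW):
        return True
    # Otherwise at least two positive answers are required.
    positives = sum(1 for q in TWO_POSITIVES_NEEDED if answers.get(q, False))
    return positives >= 2
```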

Phase 2: Testing the Model
Once the solvability factors were identified (see "Findings" section below), a new random sample of 931 cases (without replacement from the training sample) was extracted from the 2016 assault and public order dataset. The new random sample resulted in the proportion of detected cases being (broadly) proportional to the prevalence rate of detection in the underlying population of cases at approximately 15%. The eight EBIT solvability factors identified from the case-control analysis were then applied to this new sample and the accuracy of the model was tested.
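Testing the model on a held-out sample amounts to comparing each case's predicted score against its known outcome. A minimal sketch of that tabulation is given here; the function name is hypothetical, and treating a score equal to the threshold as "allocate" is an assumption.

```python
def confusion_counts(scores, detected, threshold=0.5):
    """Tabulate predictions against known outcomes.

    scores: solvability scores between 0 and 1.
    detected: True where the case ended in a sanctioned detection.
    A case is treated as allocated when score >= threshold (assumed).
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, was_detected in zip(scores, detected):
        allocated = score >= threshold
        if allocated and was_detected:
            counts["tp"] += 1  # allocated and solved
        elif allocated:
            counts["fp"] += 1  # allocated but not solved
        elif was_detected:
            counts["fn"] += 1  # not allocated, detection missed
        else:
            counts["tn"] += 1  # not allocated, not solved
    return counts
```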

Phase 3: Tracking EBIT in Operational Application
EBIT was built into daily police business across Kent policing in January 2018, following a 3-month local pilot. The launch was implemented with software supporting users in the force control room undertaking desktop reviews of all preliminary investigations. The software prompted the users to answer the EBIT solvability questions. If the case was deemed unlikely to be solved, the software then prompted them to answer the public interest questions. Once each case was processed, the software advised the user to do one of the following (Fig. 1):
Allocate the case to further investigation
Close the case pending further evidence
Send the case for further review by a supervisor
The software does not display the solvability scores to the user, but they are stored in a database along with the answers to the EBIT questions.
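The three-way recommendation described above can be expressed as a short decision function. This is a sketch under stated assumptions, not the Kent Police software: the function name, the outcome labels and the 0.5 default threshold used as a cut-off are illustrative.

```python
def recommend(score, review_triggered, threshold=0.5):
    """Return the advised action for one case.

    score: solvability score between 0 and 1.
    review_triggered: True if the public interest questions call for a
    supervisor review (only consulted for low-score cases).
    """
    if score >= threshold:
        return "allocate"  # further investigation
    if review_triggered:
        return "supervisor_review"  # structured professional judgement
    return "close_pending_evidence"
```

Keeping the score inside the function and returning only the action mirrors the design choice that users see a recommendation rather than the score itself.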
Over the first 6 months of 2018, EBIT was applied 7947 times to eligible cases. In the "Findings" section below, the final outcomes of these cases are analysed and compared with the solvability score they received.

Findings
Phase 1: Building the EBIT Model
The case-control analysis of 1004 cases (of which 482 had led to detections) identified nine factors as showing statistically significant differences between detected and undetected cases (Table 1). These included having a named suspect, the victim supporting a judicial outcome, the presence of CCTV evidence, forensic evidence and police evidence, the case being linked to multiple cases, the primary reporting party being police, and a delay of over 28 days in reporting the incident. There were also two factors that were close to p = 0.05: victim alcohol consumption and the presence of cooperative witnesses.
The initial model was then further refined, with non-statistically significant variables sequentially added and removed on the basis of comparative BIC and AIC values (Burnham and Anderson 2002).
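For readers unfamiliar with these criteria, both penalise a model's log-likelihood by its number of parameters, and the candidate with the lower value is preferred. A minimal sketch using the standard definitions follows; the log-likelihood values in the usage lines are invented for illustration and are not results from the article.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical comparison of two candidate models fitted to n = 1004 cases.
# k counts the intercept plus the predictors; log-likelihoods are made up.
with_extra_variable = {"ll": -499.5, "k": 10}
without_extra_variable = {"ll": -500.0, "k": 9}
prefer_smaller_model = (
    bic(without_extra_variable["ll"], without_extra_variable["k"], 1004)
    < bic(with_extra_variable["ll"], with_extra_variable["k"], 1004)
)
```

BIC penalises each extra parameter by ln(n), which exceeds the AIC penalty of 2 once n is larger than about 7, so BIC tends to favour the smaller model in a sample of this size.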
The final model, shown in Table 2, contains eight statistically and operationally relevant variables. Two variables were weakly statistically significant but omitted from the final model. These were the reporting party and alcohol consumption by the victim. While interesting from a solvability perspective, these were excluded from the final model because of threats to legitimacy. It is highly unpalatable to respond to intoxicated victims differently to sober ones and equally to privilege the information provided by some informants over others.
Of the eight solvability factors identified, seven indicate increased detection; extended delays in reporting to police produced lower odds of detection. The odds are summarized in Fig. 3, in which an odds ratio of one indicates that the presence of a factor does not affect the odds of the outcome (Szumilas 2010). All eight variables are statistically significant as the error bars do not cross the centre vertical line denoting "OR = 1".
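The link between a fitted logistic regression coefficient and the odds ratio plotted in a figure of this kind can be sketched as follows. The function names are hypothetical, and the 95% Wald interval (z = 1.96) is an assumption, since the article does not state how its error bars were computed.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for one coefficient.

    beta: logistic regression coefficient (log odds).
    se: its standard error.
    z: normal quantile for the interval (1.96 assumed for 95%).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def excludes_one(lower, upper):
    """True when the interval does not cross OR = 1."""
    return lower > 1.0 or upper < 1.0
```

A factor that raises the odds of detection has an interval wholly above 1, while a factor such as a long reporting delay, which lowers the odds, has an interval wholly below 1.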
In order for readers to be able to replicate the logistic regression formula for EBIT, we display it in algebraic form below. When calculated, the regression yields an EBIT solvability score between 0 and 1. This model was built into a bespoke software solution (built by Kent Police) that allows EBIT users to simply answer the eight questions and be provided with guidance on case solvability. This solution was deemed a preferable alternative to managers performing manual calculations or looking up the results in large tables displaying the outcome based on all permutations:
In order to get an initial understanding of optimal accuracy, the model was tested on the training sample before an independent sample was obtained. This was done because data collection for EBIT is resource intensive. The results of this analysis are shown in Table 3. Of the 1004 cases, the model accurately predicted the outcome of 827 (82%) at the default 0.5 threshold. The sample is case controlled and does not represent the actual proportion of detected cases. Nonetheless, the 13% false-negative rate was adequate, particularly as any police force can adjust the model score at which cases are immediately allocated from the baseline of 0.5 to reduce the false-negative rate further (at the cost of more false positives).
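For orientation, the general form of a logistic regression with eight binary predictors can be written as follows. The symbols are illustrative notation and not necessarily those of the article: x_i equals 1 when solvability factor i is present and 0 otherwise, and the coefficients are left symbolic because the fitted values belong to Table 2.

```latex
\[
  z = \beta_0 + \sum_{i=1}^{8} \beta_i x_i ,
  \qquad
  p = \frac{1}{1 + e^{-z}} ,
  \qquad x_i \in \{0, 1\}
\]
```

Here p is the solvability score between 0 and 1. A case is automatically allocated when p reaches the chosen threshold (0.5 by default), which is equivalent to z being non-negative.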

Phase 2: Testing the Model
A robust assessment of the accuracy of EBIT, however, requires an independent "test" sample, which does not use any of the cases used for developing the predictions in the "training" sample. The results of testing the model on the independent "test" sample of 931 randomly selected cases are displayed in Table 4. The overall distribution of this new test sample yielded a detection rate proportional to the force-wide detection rate of approximately 15%. In this scenario, the EBIT model correctly predicted that 66% of the cases would not lead to a sanctioned detection. Only 26 cases, or 2.8% of the sample, were false-negative ("harmful") errors by which the model failed to predict an actual detection. Interestingly, under this scenario, the model allocated only 289 cases (31%) and had a detection rate among allocated cases in this sample of 39%, almost three times as high as the business-as-usual detection rate.
In order to properly assess the relationship between the false-negative error rate (the important harmful errors of the model) and the volume of cases allocated by the model, these rates were plotted in Fig. 4. This plot was chosen instead of the standard receiver operating characteristic (ROC) curve or area under the curve (AUC) because, in this context, it clearly shows the relationship between case allocation and detection rate. Fig. 4 shows the range of results between setting the false-negative threshold at zero (where all cases would be allocated and none missed) and, at the other extreme, setting the false-negative threshold at one, which would mean no cases were allocated and all detections would be missed. Strategic decision-makers can use this type of plot to make an evidence-based decision on the level of resourcing they have available and the cost of missed detections. In Fig. 4, a threshold of 0.5 was chosen for live application of the tool with the option of later refinement.
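The figures reported above for the test sample can be cross-checked with simple arithmetic. The sketch below reconstructs an approximate confusion matrix from the rounded values in the text; because the 39% detection rate is rounded, the derived counts are approximations and are not a substitute for Table 4.

```python
# Values taken from the text above.
total = 931
allocated = 289
false_negatives = 26
allocated_detection_rate = 0.39  # rounded figure as reported

# Derived counts (approximate, because the rate above is rounded).
true_positives = round(allocated * allocated_detection_rate)
false_positives = allocated - true_positives
true_negatives = total - allocated - false_negatives
detected = true_positives + false_negatives

# Derived rates, for comparison with the percentages quoted in the text.
share_allocated = allocated / total
share_true_negative = true_negatives / total
share_false_negative = false_negatives / total
sample_detection_rate = detected / total
false_positive_rate = false_positives / (false_positives + true_negatives)
```

Under these assumptions, the derived rates round to the 31%, 66%, 2.8%, 15% and 22% quoted in the article.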

Phase 3: Tracking EBIT in Operational Application
The model went live on the 3rd of January 2018 across Kent Police and was used 7947 times over the first six months of 2018. The distribution of solvability scores across live cases was varied and unequal. The two most frequent crime categories had almost 1200 cases and included cases that were recently reported with no solvability factors (score 0.01) or had both support for prosecution and a named suspect (score 0.67). Overall, the model allocated 43% of cases for further investigation, an increase on the 31% allocated in the retrospectively tested test sample (Fig. 5).
Once secondary investigation was complete on allocated cases, we undertook an analysis of the outcomes in relation to the solvability score that EBIT calculated. Of the 7947 cases that went through EBIT, 3427 were allocated for further investigation; of these, 175 were ultimately detected. The distribution of allocated cases split by outcome is shown in Fig. 6. The "violin plots" show the overall distributions of the two groups, with detected cases having a modal value of 0.85, while undetected cases most frequently had a score of 0.67. The means of the two samples were statistically significantly different, with detected cases averaging 0.84 versus 0.76 for allocated undetected cases (t = 6.32, df = 3425, p < 0.01). The median solvability score was 0.8 in the detected group versus 0.75 in the undetected cases. In fact, the undetected median score was equal to the lower quartile of the detected cases, meaning 25% of detected cases had a solvability score less than 0.75 compared with half of undetected cases.
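The reported degrees of freedom (3425 = 3427 - 2) are consistent with a pooled-variance two-sample t-test, although the article does not name the test used. A minimal sketch of that statistic is given here; the function name is hypothetical, and the standard deviations it requires are not reported in the text.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic with pooled variance.

    Returns (t, degrees of freedom). ASSUMPTION: equal variances in the
    two groups, which is what df = n1 + n2 - 2 implies.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df
```

With 175 detected and 3427 - 175 = 3252 undetected allocated cases, the function returns df = 3425 whatever standard deviations are supplied.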
Fig. 6 Distribution of the solvability score in detected and undetected allocated EBIT cases
An analysis of all outcomes shows that a small minority of cases that were not automatically allocated were still detected (Fig. 7). As the probability of solvability goes up, the percentage of cases that are detected also rises. It is a clear picture: the detection rate is 2.7 times higher for the most solvable quartile versus the least solvable (automatically allocated) quartile. The detection rate was comfortably over 10% for those cases that scored above 0.9 on EBIT solvability. There were also a small number of cases that were detected but were not automatically allocated by EBIT. The majority of these were due to subsequent reopening of the case due to new evidence coming to light, often CCTV or the identification of a named suspect. A small minority were identified via the public interest test component of EBIT and allocated for investigation as a result.
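A quartile comparison of this kind can be sketched as follows. The function name is hypothetical, and splitting by rank into four groups of near-equal size is an assumption; the article does not describe how tied scores at quartile boundaries were handled.

```python
def detection_rate_by_quartile(scores, detected):
    """Detection rate in each score quartile, lowest scores first.

    scores: solvability scores for allocated cases.
    detected: True where the case was ultimately detected.
    Cases are ranked by score and cut into four groups of near-equal size.
    """
    pairs = sorted(zip(scores, detected), key=lambda pair: pair[0])
    n = len(pairs)
    rates = []
    for q in range(4):
        chunk = pairs[q * n // 4:(q + 1) * n // 4]
        if chunk:
            rates.append(sum(1 for _, d in chunk if d) / len(chunk))
        else:
            rates.append(float("nan"))  # fewer than four cases supplied
    return rates
```

The ratio of the last element to the first gives the "times higher" comparison between the most and least solvable quartiles.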

Conclusion
By using EBIT, Kent Police has been able to improve the efficiency of its investigations into minor non-domestic assault and public order cases. The model employed does a very good job of prioritizing cases based on solvability, with those deemed most solvable demonstrably being solved at higher rates than those predicted to be less solvable.
This demonstrated capacity to predict a detection is an advantage for police agencies wanting to prioritize cases that are reasonable uses of police time and resources. EBIT offers a substantial improvement in predictive accuracy over previous methods of allocation based on professional judgement alone or of allocating all cases for any high-volume offence category. Prioritizing case investments has become increasingly important in recent years in the UK. As crime recording standards have been adjusted to capture more reports of crime, in particular interpersonal crime, EBIT provides a fair and consistent method to decide which cases could be closed at the end of a preliminary investigation. A further benefit is the consistency and personalisation that EBIT provides the victims of crime. EBIT excludes all cases involving domestic abuse, hate crime or assault above the common assault charging standard. All remaining eligible cases receive a uniform investigative process and review criteria. Cases that meet the solvability threshold are automatically allocated, thereby removing regional (intra-county) or personal (officer-to-officer) differences in allocation criteria that are usually inherent in investigative allocations. The public interest questions make sure that particular needs of the victim are identified and appropriately addressed and that offenders get appropriate police attention. The result is that all similar cases, victims and offenders receive similar treatment based on externally auditable criteria.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.