Findings

Police investigators occasionally seek the support of specialists in various fields. Cases of murder and rape, for instance, call for the use of all available resources to prevent future offending by the perpetrators, and serial offenses (believed to have a single perpetrator) can prompt the employment of consultants to link the crimes and anticipate likely sites of future offending (or the offender’s “home base”; Rossmo 2000, 2009; Woodhams et al. 2007). The statistical training and specializations of academic criminologists and psychologists make them candidates for such consultancy (Alison and Rainbow 2011). In the United Kingdom (and some other Western countries) law enforcement agencies have such consultants on staff. The task of these professionals is referred to as Behavioural Investigative Advising (BIA).

The field of BIA is young and still establishing professional and scientific standards (Dowden et al. 2007; Alison and Rainbow 2011). The research literature and empirical basis of BIA are rapidly expanding and improving (Dowden et al. 2007; Almond et al. 2011). Investigators have reported that BIA consultancy is useful both as a second opinion and as a decision support tool (Rainbow 2011). This tool aims to be accurate, useful, specific, and falsifiable (Alison et al. 2003). Meeting these criteria ensures that the consultancy is beneficial to police and allows the product to be evaluated after the investigation.

The advising process can be summarized generally as using the knowns of an investigation to estimate unknowns useful to investigators; for example, moving from the known locations of a series of crimes to the possible residence or workplace of the offender (Rossmo 2000). BIA consultants can assist in locating, describing, and prioritizing suspects by contributing scientific knowledge and formal analysis of “national datasets and other relevant base rate data” (Rainbow et al. 2011, p. 37). That is, their contribution is the assimilation of research literature, evidence, and context to optimize decision making.

Due in part to the field’s recent genesis as a scientific discipline, BIA professionals use a multitude of quantitative approaches to arrive at estimates for decision support. The vast majority of these (e.g., correlation, Jaccard’s indices, chi-square tests, logistic regression) may aptly be called “frequentist”. That is, the majority of approaches involve either interpreting likelihoods from frequency data or utilizing null hypothesis significance testing to interpret estimates of unknowns.
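As an illustration of one of the frequency-based measures named above, the following minimal sketch computes a Jaccard coefficient between the sets of behaviours recorded for two crimes. The behaviour codes are hypothetical and invented for this example; they are not drawn from any of the cited studies, and the function name `jaccard` is likewise an assumption.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient: behaviours shared by both crimes divided by
    all behaviours observed in either crime."""
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical behaviour codes, for illustration only.
crime_a = {"forced_entry", "weapon_shown", "victim_bound", "property_taken"}
crime_b = {"forced_entry", "weapon_shown", "property_taken", "disguise_worn"}
crime_c = {"con_approach", "disguise_worn"}

sim_ab = jaccard(crime_a, crime_b)  # 3 shared behaviours out of 5 distinct
sim_ac = jaccard(crime_a, crime_c)  # no shared behaviours

print(f"A vs B: {sim_ab:.2f}")
print(f"A vs C: {sim_ac:.2f}")
```

A coefficient of this kind summarizes observed overlap only; it carries no prior information about how common each behaviour is, which is the sense in which the text groups it with frequentist tools.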

Bayesian statistical inference is the algorithmic combination of previous and new data to obtain the probability of one or more causes producing the new data (Gill 2009; de Morgan 1838). This is different from inferring the simple probability of said data being observed (randomly or otherwise), which is the cornerstone of the more commonly used frequentist methods.

Bayes’ theorem formally combines quantifications of one’s pre-analysis information (a prior), some base rate criminological and demographic data (a normalizing constant), and a likelihood of obtaining one’s evidence. As shown in Figure 1, the prior and likelihood are multiplied together and divided by the normalizing constant, yielding one’s new conclusion or estimate (the posterior). This is more generally expressed as: The probability of a hypothesis (H) given an observation (O) is equal to the probability of obtaining the observation given the hypothesis is true, multiplied by the prior probability of the hypothesis, divided by the unconditional probability of obtaining the observation.
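The verbal statement above corresponds to the standard form of Bayes’ theorem. The second line, which expands the normalizing constant for the case of a hypothesis and its complement, is a textbook identity (the law of total probability) added here for convenience; the symbol $\neg H$ for “the hypothesis is false” is notation introduced for this sketch.

```latex
\begin{align}
P(H \mid O) &= \frac{P(O \mid H)\, P(H)}{P(O)} \\
P(O) &= P(O \mid H)\, P(H) + P(O \mid \neg H)\, P(\neg H)
\end{align}
```

Here $P(H)$ is the prior, $P(O \mid H)$ is the likelihood, $P(O)$ is the normalizing constant, and $P(H \mid O)$ is the posterior.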

Figure 1. Bayes’ Theorem expressed in a) probability statements, b) Bayesian terms, and c) investigative language.
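A small numerical sketch may make the roles of the prior, likelihood, and normalizing constant concrete. All probabilities below are hypothetical values chosen for arithmetic convenience, not estimates from any dataset, and the function name `posterior` is an assumption of this example.

```python
def posterior(prior: float, p_obs_given_h: float, p_obs_given_not_h: float) -> float:
    """Posterior probability of hypothesis H after observing O.

    The normalizing constant P(O) is computed from the law of total
    probability over H and its complement.
    """
    for p in (prior, p_obs_given_h, p_obs_given_not_h):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    p_obs = p_obs_given_h * prior + p_obs_given_not_h * (1.0 - prior)
    if p_obs == 0.0:
        raise ValueError("the observation has zero probability under both hypotheses")
    return p_obs_given_h * prior / p_obs


# Hypothetical investigative reading:
#   H = "the offender has a prior conviction of a given type"
#   O = "a particular behaviour was observed at the crime scene"
prior_h = 0.10            # assumed base rate of H before considering O
p_o_given_h = 0.60        # assumed chance of seeing O when H is true
p_o_given_not_h = 0.20    # assumed chance of seeing O when H is false

post = posterior(prior_h, p_o_given_h, p_o_given_not_h)
print(f"posterior = {post:.2f}")  # 0.06 / (0.06 + 0.18) = 0.25
```

With these assumed inputs the observation raises the probability of the hypothesis from 0.10 to 0.25, which shows how an observation that is three times more likely under the hypothesis still leaves the hypothesis improbable when the prior is low.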

Key distinctions between Bayesian and frequentist (also called Fisherian) approaches to BIA estimation are the use of a null hypothesis and the use of prior information. Bayesian logic treats the data as fixed and models one’s belief about relationships in the data on the basis of both the data and their context, whereas frequentist logic treats the data as random, ignores the context of the information so as to be “objective”, and typically evaluates the existence of a relationship from the initial assumption that no relationship exists. Table 1 details key relevant differences between Bayesian and frequentist approaches to statistical inference. Note, however, that some exceptions to these differences may exist, especially when considering very simple applications of Bayes’ theorem and very complex applications of frequentist statistics.

Table 1 Differences between Bayesian and Frequentist/Fisherian approaches to investigative inference

Bayes’ theorem can be effective both as a tool and as an analogue to the logical problems faced by investigators. Taroni et al. (2006) note that Bayesian analysis is well-suited for nearly all aspects of forensic investigation, and Schneps and Colmez (2013) illustrate the grievous errors that can occur when cases are built based solely on an isolated frequentist analysis of the evidence. For example, calculating a simple 1 in 6 chance of identifying an offender from a line-up versus a 1 in 12 chance may lead one to believe that having more individuals as foils in a police line-up increases the posterior probability that an accurate match was made. Wells and Turtle (1986) noted that this is not the case. Using a Bayesian updating model, they also shed empirical light on the practice of having all-suspect line-ups, which they found increases the risk of false identification.

Blair and Rossmo (2010) tackle the issue of assigning prior probability values for decision support. They argue that a Bayesian approach can improve estimation of guilt, and suggest assigning probability ranges to single or multiple pieces of evidence. They note that this does not solve the problem of assigning “guilt” values to pieces of evidence, but the approach can result in “more systematic assessments and improved investigative decision making” (Blair and Rossmo 2010, p. 133). On a cautionary note, when using databases of convicted criminals to estimate guilt, both the Bayesian and frequentist statistical approaches may perpetuate biases in a system of justice. That is, using the “usual suspects” to predict characteristics of offenders could lead to further focus on these individuals at the expense of other potential investigative leads. The Bayesian approach is not immune to this criticism, though it is less vulnerable to the specific claim that its inherent logic is biased toward this conclusion. Frequentist approaches assume the validity of a null hypothesis; that is, they assume the predictor and outcome variables may legitimately be thought to be unrelated. When this logic is used to evaluate a candidate suspect whose prior offenses are used in the model quantifying his guilt, this assumption is grossly violated and the logic of the frequentist estimator is circular. That is, the offender’s statistical relationship to himself is used as evidence against him because the test, in assuming no relationship, finds his relationship to himself “significant”. In frequentist approaches, this is a violation of the logic of the method. In Bayesian approaches this is not a logical violation (since no null assumption is required and the context of the information is adequately incorporated). However, the potential remains for an offender’s resemblance to himself to make his candidacy as a suspect more likely. This concern should be considered when using any statistical method to parse local databases for BIA consultancy.
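The idea of attaching ranges rather than single values to pieces of evidence can be sketched in code. The sketch below is not Blair and Rossmo’s procedure: it substitutes likelihood-ratio ranges and the odds form of Bayes’ theorem, and it assumes the pieces of evidence are conditionally independent. All numbers and the names `to_odds`, `to_prob`, and `posterior_range` are hypothetical.

```python
from math import prod


def to_odds(p: float) -> float:
    """Convert a probability (strictly below 1) to odds."""
    return p / (1.0 - p)


def to_prob(odds: float) -> float:
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)


def posterior_range(prior: float, lr_ranges: list[tuple[float, float]]) -> tuple[float, float]:
    """Lower and upper posterior probability for a hypothesis.

    Each evidence item is given as (lowest, highest) plausible likelihood
    ratio. ASSUMPTION: items are conditionally independent, so their
    likelihood ratios multiply. Ratios are positive, so the lowest
    posterior comes from all the low ends and the highest from all
    the high ends.
    """
    prior_odds = to_odds(prior)
    low = prior_odds * prod(lo for lo, _ in lr_ranges)
    high = prior_odds * prod(hi for _, hi in lr_ranges)
    return to_prob(low), to_prob(high)


# Hypothetical inputs: a prior of 0.20 and two pieces of evidence.
evidence = [(2.0, 4.0), (1.0, 2.0)]
low_post, high_post = posterior_range(0.20, evidence)
print(f"posterior between {low_post:.2f} and {high_post:.2f}")
```

Reporting the resulting interval instead of a single figure keeps the uncertainty in the evidence values visible to the investigator, though it does not address the source of those values or the database bias discussed above.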

Table 2 presents a procedural comparison of two approaches to investigative advising, taken from Salo et al. (2012) and Allen et al. (in press). These papers empirically compare Bayesian to non-Bayesian prediction for investigative advising. Column a draws on Salo et al. (2012), who compared a Bayesian updating model with a dimensional model for linking homicide cases using only offender behavioural information (i.e., only details of what the offender did). Both models utilized identical real-world data. The Bayesian approach, by better accounting for absent information, correctly classified 83.6% of cases, versus 62.9% for the dimensional approach. Column b draws on Allen et al. (in press), who compared an empirical Bayesian approach to a “pared-down” base rate method of estimating offender characteristics. The Bayesian approach, by incorporating more contextual information, achieved 74.6% prediction accuracy versus 63.5% for the base rate method.

Table 2 Procedural comparisons based on a (highly simplified) investigative advising example

Bayesian methods are subject to a disproportionate amount of criticism for being “subjective” and prone to misuse (e.g., Doren 2006). This is due in part to the forthright philosophy of Bayesian analysis, which formally “confesses” that Bayesian estimates, like all other estimates, are a product of, and representative of, beliefs about the hypothesis being explored. Popperian objectivity requires that the statements and evidence be entirely in observable space (Popper 1972). Therefore, provided all the values used in an analysis are thoroughly explained and justified, Bayesian methods are no less objective than their frequentist counterparts (which involve many subjective choices).

Bayesian methods can formally contextualize, and thus improve, frequentist analysis. In the 20th century, insurance companies used Bayesian inverse probability, contrary to a rabidly Fisherian zeitgeist, without knowing that their computations were incorporating Bayes’ theorem (McGrayne 2011). Similarly, courts in the United States have been using Bayesian risk assessments (Donaldson and Wollert 2008; Wollert 2007) while also lambasting Bayesian approaches (e.g., Doren 2006). Conversely, BIA research has largely used frequentist methods to perform a fundamentally Bayesian task. Whatever the reputation of Bayesian analysis, the task and field of BIA are fundamentally Bayesian. A Bayesian approach to investigative advising is therefore the most logical and promising way forward.