Introduction

In England and Wales, a person is reported as missing every 2 minutes (Fyfe et al. 2015). Missing person (“misper”) events can be costly for police services in terms of time and resources (Shalev-Greene and Pakes 2012). The Home Office (2019) estimated a cost of £2415 for each misper case, more than double the police costs of investigating a robbery (£1010 per crime) or an assault where an injury is sustained (£1130 per crime), and more than four times the cost of a dwelling burglary investigation (£530 per crime). Unlike those crimes, fortunately, most misper cases do not end in serious harm. If the time and costs of misper investigations could be shifted to the prevention of serious crime with no increase in risk of harm to missing persons, the public interest would be better served.

The aim of limiting the costs of misper cases is the implicit premise for the College of Policing’s (2019) national framework for police classification of the risk of harm at the time of a misper report. Yet that system has been neither developed nor tested with systematic evidence. The aim of this study is to provide an empirical test, supported by insights from single-variable comparisons of the odds of the misper case leading to serious harm (as defined below). This study uses 92,681 misper records contained within the COMPACT (Community Policing and Case Tracking) records management system in Devon and Cornwall Police to explore the relationship between knowable facts at reporting and harmful outcomes for misper cases.

Context

Devon and Cornwall Police is the most south-westerly force in England. Covering nearly 4000 square miles, the force area includes the cities of Exeter and Plymouth, the expanses of Dartmoor and Exmoor and more miles of coastline than any other force in England and Wales. A thriving tourist industry sees the population swell from approximately 1.65 million to between 10 and 12 million each summer. All of this is policed by a little over 3000 officers.

The National Crime Agency (NCA) shows Devon and Cornwall accounting for 2.4% (7982 of 327,735) of the misper incidents in England and Wales between 1 April 2017 and 31 March 2018 (National Crime Agency 2019). Ten forces recorded more missing adults during that period, and only five forces (The Metropolitan Police Service, Greater Manchester, West Yorkshire, Thames Valley, and Avon and Somerset) recorded more missing children. Devon and Cornwall Police recorded more missing people between 1 April 2017 and 31 March 2018 than any of the forces considered by HMICFRS as most similar (HMICFRS 2019).

The College of Policing (2019) states that risk assessment decisions should be informed by professional judgement and always be subjective. The fundamental flaw in taking a position that relies on officers’ experience and craft, however, is that around half of officers making decisions on misper investigations and risk assessments do not class themselves as experts or even appropriately trained (Smith and Shalev-Greene 2014). In addition, there are doubts surrounding the validity of the College of Policing question set that is currently provided to aid decision making. Officers who do not always have the required experience or skill are making subjective decisions on misper risk assessments with the aid of an untested and possibly flawed decision-support tool.

Previous research paints a picture of (1) increasing misper-related demand and (2) a diversity in the type of harm mispers are at risk from, depending on their age, gender and circumstances. While there is research that describes the various characteristics and vulnerability factors of mispers, there is a lack of understanding regarding the role of these factors in harm prediction, and regarding whether the predictive value of risk factors is conditional on age and gender.

Research Question

In light of the current state of knowledge, therefore, the key research question in this prospective analysis is this:

Given the information available to Devon and Cornwall Police when they receive a missing person report, what variables could be used to predict whether the reported person will come to serious harm?

This question leads to a series of detailed subquestions:

  a. How many missing person cases have been reported to Devon and Cornwall Police from 1 April 2008 through 31 March 2019? What proportion of these cases have come to harm?

  b. Across all variables about the cases which can be reasonably gathered at the time of the initial report, how strong are their bivariate associations with the odds of a missing person coming to harm?

  c. When the cases are divided into subgroups based on the gender and age of the missing person, does the strength or direction of associations with other individual factors and harm change from group to group?

  d. When missing persons are graded into high, medium and low risk categories by the person recording the report, how accurate are these gradings? In determining that level of accuracy, what is the rate of false positives and false negatives for various risk assessments?

Data

All persons reported as missing in Devon and Cornwall are recorded on COMPACT (Community Policing and Case Tracking), the missing persons case and records management system used by the force since April 2008. Each report of a misper is recorded under a reference number, known as a RID (report identification). The call taker, or police officer, in the police control room completes the information report while the reporting person is on the line, filling a number of standard fields that either require free-text responses (such as circumstances surrounding missing report) or defined “yes or no” options from drop-down menus.

COMPACT assigns each person reported as missing a personal identification number, known as a PID, a reference number unique to COMPACT. If an individual is subsequently reported as missing again, the report will have a unique RID, but the PID will be reused. This allows the COMPACT system, and those later reviewing it, to view all reports relating to specific individuals in one place.

From the 1st of April 2008 to the end date for our data collection, 94,689 missing reports were created by Devon and Cornwall Police. These records were extracted for this study from COMPACT and placed into a spreadsheet, displaying each RID as a line within the table. As a result, the unit of analysis here is an individual misper report, as identified by its RID. In order to answer each of the research questions, the initial risk grading (high, medium or low) and personal details of the missing person were recorded against each RID.

To enable reliable analysis of these records, a process of data cleaning took place, leading to the removal of the following:

No Found Report

Exactly 1607 records (1.7% of the total) were removed due to there being no report attached regarding whether the misper was eventually found, and what their status was when they were located. The absence of a found report makes it impossible to determine whether or not the misper came to harm.

Risk Gradings

Exactly 151 records (0.16% of the total) had ‘Not Specified’ or ‘Not Evaluated’ recorded as the initial risk grading on the report. As these records could not be placed in one of the defined risk categories described above, they were removed from the study. Another 2909 records (3% of the total) were recorded under the risk grading of ‘No Apparent Risk’. For the purposes of this study, these records were included with those recorded as low risk.

Gender

Two hundred and fifty records (0.26% of the total) had either ‘Gender Unknown’ or ‘Transgender’ recorded under gender. In order to answer the research question relating to the value of characteristics in determining the probability of a misper coming to harm, the gender of the misper needs to be defined. This is not possible when the gender is recorded as ‘unknown’. COMPACT records all transgender people together and does not separate transgender males and transgender females, which also prevents clear categorisation for the purpose of analysis. As a result, these 250 records were removed.

Final Data Set

In total, just 2008 records (2% of the total) were excluded from this study. This left 92,681 records to be analysed.
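The exclusion counts above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the record counts reported in this section; the variable names and the `pct` helper are illustrative and are not part of COMPACT or of the study's own tooling.

```python
# Record counts as reported in the Data section.
extracted = 94_689  # missing reports extracted from COMPACT

# Records removed during data cleaning, by reason.
removed = {
    "no found report": 1_607,
    "risk grading not specified or not evaluated": 151,
    "gender unknown or transgender": 250,
}

# Note: the 2909 'No Apparent Risk' records were recoded as low risk,
# not removed, so they do not appear in the removal tally.
total_removed = sum(removed.values())
final_sample = extracted - total_removed


def pct(count, base=extracted):
    """Return count as a percentage of the extracted records."""
    return 100 * count / base


for reason, count in removed.items():
    print(f"{reason}: {count} ({pct(count):.2f}%)")
print(f"total removed: {total_removed} ({pct(total_removed):.1f}%)")
print(f"final sample: {final_sample}")
```

The three removal categories sum to the 2008 excluded records, and subtracting them from the extracted total gives the 92,681 records analysed.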

Methods

Definitions

In this study, we use the definition of ‘missing’ provided by the England and Wales College of Policing (2019): ‘Anyone whose whereabouts cannot be established will be considered as missing until located, and their well-being or otherwise confirmed’.

The same Authorised Professional Practice (APP) document (College of Policing 2019) defines “harm” as follows:

“Risk of serious harm has been defined as….:

‘A risk which is life threatening and/or traumatic, and from which recovery, whether physical or psychological, can be expected to be difficult or impossible.’”.

The College APP also provides risk categories of “High” (risk of serious harm very likely), “Medium” (risk of harm likely but not serious), “Low” (risk of harm possible but minimal) and “Absent” (no apparent risk of harm). Devon and Cornwall Police include “Absent” in the “Low” category.

If a misper is recorded as having come to harm, the options available to specify this harm in the records for this study are death, suicide, self-harm, mental harm, sexual assault and injury.

Analysis Plan

This section describes the analytic approach for each of the research subquestions.

  a. How many missing person cases have been reported to Devon and Cornwall Police from 1 April 2008 through 31 March 2019? What proportion of these cases has come to harm in each demographic group?

Of the 92,681 missing reports received by Devon and Cornwall Police over 11 years from 2008 through 2019, 3481 (3.8%) ended with the person coming to harm, a rate of 1 in 27 reports. The analysis disaggregates the harm rate by demographic categories.

  b. Across all variables about the cases which can be reasonably gathered at the time of the initial report, how strong are their bivariate associations with the odds of a missing person coming to harm?

For this question, the individual characteristics known at the time of reporting are used as the predictor variables in order to determine whether the presence of each single characteristic is associated with an increased likelihood of the misper coming to harm. For example, what proportion of mispers recorded as having a disability came to harm, compared with the proportion of those without a disability?

Odds ratios (ORs) show how the presence or absence of characteristic A is associated with the presence or absence of characteristic B (here, whether or not the misper came to harm). Odds ratios can be used to compare the relative magnitude of various factors for a particular outcome in a way that adjusts for the numbers of cases with each of the characteristics (Szumilas 2010). An OR was calculated to determine the relationship between each of the characteristics in Table 1 and the misper case resulting in harm. A p value was calculated in order to determine the statistical significance of each OR. This was particularly important as some of the predictors occurred in only a small portion of the sample, which invites questions regarding the reliability of the estimate. A p value smaller than 0.05 provides confidence that the OR reflects a reliable association between the risk factor and harm, and not just a chance relationship in this particular data set.
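To make the calculation concrete, the sketch below computes an odds ratio from a 2×2 table, using the visual-impairment counts reported in the findings (94 mispers with the characteristic, 10 of them harmed, within 92,681 reports and 3481 harm outcomes). It assumes that those 10 cases are included in the 3481. The study does not state which significance test produced its p values, so the Wald (normal-approximation) test used here is an assumption and need not reproduce the reported p value exactly. The function name `odds_ratio` is illustrative.

```python
import math


def odds_ratio(harm_with, total_with, harm_without, total_without):
    """Odds ratio, Wald 95% CI and two-sided Wald p value for a 2x2 table."""
    a = harm_with                      # characteristic present, harmed
    b = total_with - harm_with         # characteristic present, not harmed
    c = harm_without                   # characteristic absent, harmed
    d = total_without - harm_without   # characteristic absent, not harmed

    estimate = (a / b) / (c / d)
    # Standard error of log(OR); assumes no cell is zero.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(estimate)
    ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    # Two-sided tail area of the standard normal for z = log(OR) / se.
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return estimate, ci, p_value


# Counts taken from the text; the split of harm cases is an assumption.
or_vi, ci_vi, p_vi = odds_ratio(10, 94, 3481 - 10, 92681 - 94)
print(f"OR = {or_vi:.2f}, 95% CI = ({ci_vi[0]:.2f}, {ci_vi[1]:.2f}), p = {p_vi:.4f}")
```

Under these assumptions the estimate rounds to the OR of 3.06 reported for visual impairment. With only 10 harmed cases in the exposed group the confidence interval is wide, which is why the p value matters for rare characteristics.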

  c. When the cases are divided into subgroups based on the gender and age of the missing person, does the strength or direction of associations with other individual factors and harm change from group to group?

Table 1 Age and gender

The primary round of OR calculations (subquestion b) analysed all eligible records as the sample. In order to calculate the answers to this subquestion, the 92,681 records were subdivided into age and gender categories (see Table 1).

Six separate OR sets were created to calculate the relationship between each of the case characteristics and the misper cases resulting in harm, for each of the age and gender categories above.
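The subgroup step can be sketched as the same OR calculation repeated within each stratum and then compared with the pooled figure. The counts below are HYPOTHETICAL, invented purely to illustrate the mechanics; they are not taken from the study's tables, and the two strata shown are a simplification of the six age and gender categories.

```python
def odds_ratio(harm_with, total_with, harm_without, total_without):
    """Odds of harm with the characteristic divided by odds without it."""
    odds_with = harm_with / (total_with - harm_with)
    odds_without = harm_without / (total_without - harm_without)
    return odds_with / odds_without


# HYPOTHETICAL counts for one characteristic, invented for illustration:
# (harmed with factor, total with factor, harmed without, total without)
strata = {
    "juveniles": (40, 2000, 10, 1000),
    "adults": (16, 200, 70, 1000),
}

# One OR per subgroup.
by_group = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Pool the strata by summing each column, then compute a single OR.
pooled_counts = [sum(column) for column in zip(*strata.values())]
pooled = odds_ratio(*pooled_counts)

for name, value in by_group.items():
    print(f"{name}: OR = {value:.2f}")
print(f"pooled: OR = {pooled:.2f}")
```

In this invented example the characteristic is positively associated with harm within each subgroup but negatively associated in the pooled data, because it is concentrated in the subgroup with the lower base rate of harm. This is one way in which the direction of an association can differ between the whole cohort and a subgroup.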

  d. When missing persons are graded into high, medium and low risk categories by the person recording the report, how accurate are these gradings? In determining that level of accuracy, what is the rate of false positives and false negatives for various risk assessments?

In order to test the accuracy of the current risk assessment process, we must assess the frequency with which the outcome is correctly predicted. The definitions used by the police service in England and Wales all include the potential for the misper to come to harm: the difference is in the seriousness and likelihood of that harm. By analysing how many of the 92,681 mispers were placed in each risk grade, and how many of those did or did not come to harm, the analysis below describes the accuracy of the current APP process.

Because we examine the predictive value of vulnerability factors and characteristics in the context of age and gender, the question of accuracy was also explored for each age category.

Having identified the proportion of mispers in each risk category, the final part of the research question was to determine the overall accuracy of the current process and the false positive to false negative ratio (FP:FN). In the context of this study, false positives are those mispers that were forecast to come to harm but did not, and false negatives are those that were forecast not to come to harm but did. The accuracy (ACC) is calculated from a confusion matrix, as shown in Table 2, using a simple formula:

$$ \mathrm{Accuracy}=\left(\mathrm{A}+\mathrm{E}\right)/\mathrm{I} $$

The accuracy, or the percentage of cases in which the current process successfully predicts harmful and no-harm outcomes for mispers, along with the false positive to false negative ratio (FP:FN; i.e., B/D in Table 2), is reported in the Findings section of this study.
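Table 2 is not reproduced in this text, so the cell labels in the formula above can be mapped to standard confusion-matrix terms only under an assumption: that A and E are the two correct-forecast cells (taken here as true positives, TP, and true negatives, TN), B is the false positives (FP), D is the false negatives (FN) and I is the grand total. Under that assumed layout, the two summary measures can be written as:

$$ \mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN}},\qquad \mathrm{FP}:\mathrm{FN}=\frac{\mathrm{FP}}{\mathrm{FN}} $$

The two percentages quoted with the confusion matrix in the findings appear to be conditional shares, one taken over the cases forecast as high risk and the other over the cases that came to harm:

$$ P\left(\text{no harm}\mid \text{forecast high}\right)=\frac{\mathrm{FP}}{\mathrm{TP}+\mathrm{FP}},\qquad P\left(\text{not forecast high}\mid \text{harm}\right)=\frac{\mathrm{FN}}{\mathrm{TP}+\mathrm{FN}} $$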

Table 2 Forecasting accuracy (“Confusion”) matrix

Findings

  a. How many missing person cases have been reported to Devon and Cornwall Police from 1 April 2008 through 31 March 2019? What proportion of these cases has come to harm, in each demographic group?

Table 3 shows that the demographic group at greatest risk of harm in this data set is women aged 18–64, followed closely by men over 64. Their risks are more than four times higher than the risks for juveniles, who have the lowest risks of harm of any group by both age and gender. Put another way, the risk of harm is 1 for every 59 juveniles reported missing, compared with 1 in every 13 women aged 18–64 or men over 64. Across all age categories, juveniles have the lowest risk of harm, with anyone over 18 about four times more likely, on average, to be harmed than those under 18.

Table 3 Percent of mispers harmed by age and gender

  b. Across all variables about the cases which can be reasonably gathered at the time of the initial report, how strong are their bivariate associations with the odds of a missing person coming to harm?

To explore the predictive potential of variables reasonably gathered at the time of the initial report, ORs were calculated for the entire sample, using the characteristics and risk factors listed in Table 1 as predictor variables. Table 4 shows the OR for these risk factors as predictors. For example, the OR for visual impairment is 3.06, meaning that the odds of coming to harm are about three times higher for mispers with a visual impairment than for those without one (p = 0.002). In the sample of 92,681 cases, 94 mispers were recorded as having a visual impairment, and 10 of those were harmed (10.6%), a rate about 2.8 times that of the remaining 92,587 persons in the sample who were not visually impaired (3.7%).

Table 4 Odds ratios for all mispers

Table 4 presents the odds ratios in descending order of magnitude for each characteristic recorded at the time of the misper report. The top seven are statistically significant indicators of increased likelihood of coming to harm compared with persons without those characteristics. The bottom three are statistically significant indicators of decreased likelihood of coming to harm compared with persons without those characteristics. The strongest association with reduced risk of harm is one that is often taken to mean the opposite: a history of prior misper reports indicates a 43% lower chance of coming to harm than for persons who have no prior reports.

The data in Table 4 are displayed in Fig. 1, so that the reader can visualize the relative differences in ORs across these single variables examined one at a time. They do not, however, indicate anything about the possible combinations of these factors, which could change the interpretation of each characteristic substantially. That possibility will become very clear in the presentation of these analyses when separated by age and gender groups.

Fig. 1 Logarithmic scale displaying comparison of odds ratios for all mispers

  c. When the cases are divided into subgroups based on the gender and age of the missing person, does the strength or direction of associations with other individual factors and harm change from group to group?

Examining the characteristics at time of report in relation to harm by demographic subgroup is a form of “moderator” analysis. In this case, it shows whether the demographic classification moderates (changes) the effects of each characteristic. Because the proportion of mispers who come to harm varies across age and gender categories (Table 1), this section explores the importance of variation between each group in the magnitude of each characteristic’s odds ratios, based on age and gender.

Juveniles

ORs were calculated for the 54,603 juveniles (59% of the cohort). The results are displayed in Table 5.

Table 5 Odds ratios for juvenile mispers

Table 5 shows that the strongest association between harm and a personal characteristic in the under-18 age group is reported suicidal threats or ideation, for which the odds of harm are four times those of non-suicidal mispers. Nonetheless, the absolute risk for the suicidal group is still only 6.2%, which is lower than the risk for the entire misper group over age 65 (regardless of their coding as suicidal or not): 6.8% overall, and 7.3% for males.

While for the entire cohort the variables of learning disability, CSE risk, in care and repeat missing person were negatively associated with harm, for juvenile mispers, all four of these factors change direction, and become significant predictors of harm. Figure 2 presents each of these odds ratios in visual form.

Fig. 2 Logarithmic scale displaying comparison of odds ratios for juvenile mispers

‘Suicidal’ remains a significant predictor of harm (OR = 4, p ≤ 0.001), as do mental illness (OR = 2.2, p ≤ 0.001) and disability (OR = 1.78, p ≤ 0.001). In the whole sample, being female is only weakly and non-significantly associated with harm (OR = 1.07, n.s.), but for juveniles the association is significant (p ≤ 0.001) and the OR increases to 1.80.

Separating the juvenile females from the juvenile males provides further insight into the predictive nature of risk factors. Figure 3 shows that almost all the predictor variables are positively associated with harm for both sexes, but major sex differences were evident: for males, only being ‘in care’ was a statistically significant predictor, whereas for females all predictors were statistically significant except for dyslexia and disability.

Fig. 3 Comparison of OR for juvenile males and juvenile females

Adults Aged 18–64

Adult mispers in this age group account for 35% of the cohort but 63% of the harm outcomes. Table 6 shows the OR for each characteristic among these 32,708 adults under 65 years of age.

Table 6 Odds ratios for all adults under 65

The characteristics with the largest elevation of risks in the adults under 65 group are coding as suicidal (OR = 5.42), mental illness (OR = 1.38), female (OR = 1.18, p ≤ 0.001) and disability (OR = 1.14, p ≤ 0.001). These are all statistically significant predictors of harm, as they are with juveniles. Figure 4 displays these OR on a logarithmic scale.

Fig. 4 Logarithmic scale displaying comparison of odds ratios for adult mispers

The 18–64 adult category was split by gender in order to further examine the importance of risk factors in predicting harm. The resulting OR and p values are shown in Tables 7 and 8.

Table 7 Odds ratios for adult female mispers
Table 8 Odds ratios for adult male mispers

‘Suicidal’ remains the biggest predictor of harm for both male and female adults under 65. Only 5 of the predictors are positive indicators of harm for adult females. Four of these—suicidal, reduced mobility, mental illness and disability—are also predictors of harm for adult males, albeit with differing OR. Learning disability, which is a predictor of harm in juveniles, is negatively associated with harm for 18–64 adults of both genders. Figure 5 illustrates the difference in OR, and predictive value of risk factors, for males and females.

Fig. 5 Comparison of OR for adult males and adult females

Figure 5 reveals the first result in which the same risk factor either acts as a significant predictor of harm or is negatively associated with harm, depending on the gender of the misper. Adult males who have not been reported missing previously are more likely to come to harm than their female counterparts (OR = 1.38, compared with OR = 0.82). Conversely, repeat female mispers are more likely to come to harm than repeat male mispers (OR = 1.22 compared with OR = 0.72).

Over 65s

Although this subgroup is small (5370) compared with the other cohorts in this research, by itself it would serve as one of the largest samples used in the missing persons literature. This cohort also accounts for the smallest share of harm (about 10%, n = 364) of the 3481 reports identified in this study as ending in harm. As a consequence, the n for many of the risk factors present is low. The lack of odds ratios identified as significant after calculating the p value is clearly shown in Table 9.

Table 9 Odds ratios for over 65 mispers

The comparison of OR for all mispers aged over 65 is shown in Fig. 6.

Fig. 6 Logarithmic scale displaying comparison of odds ratios for over 65 mispers

Contrary to results for mispers in the other two age categories, mental illness (OR = 0.92) is negatively associated with harm for those over 65. Tables 10 and 11 show the OR and p values for female and male mispers in this age category.

Table 10 Odds ratios for over 65 female mispers
Table 11 Odds ratios for over 65 male mispers

  d. When missing persons are graded into high, medium and low risk categories by the person recording the report, how accurate are these gradings? In determining that level of accuracy, what is the rate of false positives and false negatives for various risk assessments?

Starting with Table 12, the following tables focus on the initial risk grading of high, medium or low risk and the rates of accuracy associated with those gradings. Table 12 shows the distribution of the percentage of cases leading to serious harm over the three levels of risk classification. Like most such distributions of professional judgements, it is heavily weighted towards the middle, even though the overall distribution of the actual harm is skewed towards low.

Table 12 Proportion and percentage of all mispers in each risk category

Table 13 shows the distribution of harm across high risk vs. a group that combines the medium and low risk predictions:

  • When high risk was predicted, there was no actual harm in 89% of cases

  • When harm actually occurred, it was not predicted as high risk in 59% of cases

Table 13 Matrix of distribution of harm across all mispers (confusion matrix)

As reported above, 3.8% of all 92,681 reported mispers in this study came to harm. Unsurprisingly, the risk category with the greatest proportion of mispers that came to harm was high risk, where 10.8% (n = 1436) of the 13,260 misper records graded as high risk resulted in harm. However, as shown in Table 13, the 59% level of false negatives (cases not classified as high risk that actually came to harm) was almost six times higher than the rate of true positives (cases that were classified as high risk that did come to harm). Moreover, the high false positive rate (almost 90%) for high risk classifications denotes a very high level of wasted effort.
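The four cells of the high-risk confusion matrix can be derived from totals quoted in the text, which gives a quick consistency check on the percentages above. This is a sketch under the assumption, following the description of Table 13, that a "high" grading counts as a forecast of harm and that medium and low gradings together count as a forecast of no harm. The variable names are illustrative.

```python
# Totals quoted in the text.
total = 92_681        # misper reports analysed
harmed = 3_481        # reports that ended in harm
high = 13_260         # reports graded high risk
high_harmed = 1_436   # high-risk reports that ended in harm

# Derive the four cells (assumes 'high' = forecast of harm).
tp = high_harmed                  # graded high, harmed
fp = high - high_harmed           # graded high, not harmed
fn = harmed - high_harmed         # not graded high, harmed
tn = total - tp - fp - fn         # not graded high, not harmed

accuracy = (tp + tn) / total
fp_to_fn = fp / fn
no_harm_given_high = fp / (tp + fp)   # share of high gradings with no harm
missed_given_harm = fn / (tp + fn)    # share of harm cases not graded high

# Baseline for comparison: forecast that nobody comes to harm.
baseline_accuracy = (total - harmed) / total

print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")
print(f"accuracy = {accuracy:.1%}, FP:FN = {fp_to_fn:.1f}")
print(f"no harm given high = {no_harm_given_high:.0%}")
print(f"harm not graded high = {missed_given_harm:.0%}")
print(f"baseline accuracy = {baseline_accuracy:.1%}")
```

The derived shares match the 89% and 59% reported with Table 13. On these figures the overall accuracy of the high-risk grading is about 85%, with an FP:FN ratio of about 5.8, while forecasting no harm for every report would be correct in about 96% of cases. That baseline identifies none of the harm cases, so accuracy alone is a limited measure when the outcome is rare.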

More details on how these predictions varied by demographic grouping are reported in Doyle (2019).

Conclusions

With a total sample of 92,681 cases, this is by far the largest known study concerning mispers and their associated risks. Sample size is important for this subject because the rate at which missing persons come to harm is, fortunately, low, which makes the risk assessment analysis of specific characteristics difficult to accomplish. The results of this study allow more reliable conclusions to be drawn on the accuracy of the current risk assessment process, the value currently placed on professional judgement and subjective decision making, and the level of resources used by police services conducting misper investigations.

The College of Policing (2019) provided the question set currently used by police to aid the subjective risk assessment of mispers, which has been criticised for not being evidence-based (Coffey 2018). It is also not adaptable depending on the age and gender of the misper. The results of this study show, quite clearly, an inequality in the spread of harm across mispers categorised by age and gender, and an overall error rate of substantial proportions. A false negative rate of nearly 60% means that the high risk classification was not applied to three out of five cases that did come to harm, while the false positive rate of 89% means that nine out of ten searches added questionable value.

A way forward

A more sophisticated approach to predictive modelling would provide greater clarity with regard to the high inaccuracy of the current predictive practices. The random forest approach to forecasting (Barnes 2019; Barnes and Hyatt 2012; Berk et al. 2009) computes likelihood through a series of random decision trees, and provides far greater accuracy than OR calculations, regardless of the sample size.

One key finding of this study provides further evidence for the value of a random forests model: that the predictive value of risk factors is conditional on age and gender. While the over age 65 category, for example, has many fewer cases than other ages, their data can be built into a multivariate model. As the stream of odds ratios in this article demonstrates, there is too much information for even experts to hold in their heads. Using a random forests model built from 90,000 cases or more would likely lead to much greater precision.

This study supports the assertion of the All Party Parliamentary Group for Runaway and Missing Children and Adults (Coffey 2018) that the current process lacks efficacy and is not accurate. The subjective nature of the current process relies heavily on the experience and knowledge of the officers making the risk assessment, but Smith and Shalev-Greene (2014) have found that nearly half of the officers in one English force making these decisions do not feel adequately trained to do so. The current approach of utilising officer experience to assess risk, and forecast whether or not a misper is going to come to harm, is far less accurate than simply grading all mispers as low risk and predicting that none of them will come to harm.

This study has shown that the current process for assessing the risk of harm to a misper does not do so accurately, but that predicting harmful outcomes is possible. The primary recommendation from this research, therefore, is a move away from the subjective, one-size-fits-all method currently endorsed by the College of Policing to a new evidence-based process for assessing misper risk. The new model should also consist of more than three risk grades, with more specific gradings to enable police services to respond more appropriately. Meanwhile, policing faces one of its largest areas of demand in the investigation of missing persons (mispers), with 327,735 missing or absent person incidents created in the 2016/2017 financial year (National Crime Agency 2019) alone. To put this number in perspective, it amounts to 2.7 incidents for every police officer across the country.

In an era where forces such as Kent Police have introduced evidence-based crime triage tools, the police service should certainly be replicating this approach for an area of demand that impacts so heavily on their time and finances (Shalev-Greene and Pakes 2012), as well as the emotional and physical wellbeing of mispers and those who care about them.