Introduction

According to the 2022 Global Terrorism Index (Institute for Economics and Peace 2022), acts of violent extremism have increased by 17% in recent years and represent a major national security issue. Further, surveys find that over one-third of community members across Europe and North America self-report feeling concerned about becoming the victim of such extremist acts (Gallup 2022; Voronova 2021). As a result, public demand has increased for effective methods of violent extremism threat assessment and management by security authorities, including the police. A critical bridge between the valid and reliable assessment of threat – operationalized as targeted risk – and its evidence-based management is successful risk communication, defined by the World Health Organization (2017) as “the real-time exchange of information, advice and opinions between experts or officials and people who face a threat” (p. 1). The research literature on risk communication best practices in the behavioral sciences has grown considerably over the past three decades (Hilton et al. 2015; Rossegger et al. 2022).

Contemporary Approaches to Violent Extremism Risk Communication

More than 400 structured instruments are currently in use by security professionals (including police) across the globe for the purposes of assessing interpersonal violence risk (Singh et al. 2016). In recent years, these have come to include instruments specifically designed for use in violent extremism threat assessment (Van der Heide et al. 2019). Such instruments include structured checklists composed of empirically validated risk and/or protective factors which are static or dynamic in nature (Singh 2012). Representing a significant improvement in both predictive validity and inter-rater reliability over unstructured professional judgments of dangerousness (Viljoen et al. 2021), these instruments broadly rely on either the actuarial or the Structured Professional Judgment (SPJ) framework. Actuarial risk assessment instruments (ARAIs) rely on sum scores to make a final judgment, whereas SPJ instruments rely on the informed discretion of the assessor to make such a judgment.

When commonly used actuarial and SPJ instruments are compared (see Table 1), it becomes clear that they communicate the likelihood of future violence in one of four ways. First is a numeric, norm-based approach in which sum scores are used to place individuals into one of several categories, each of which is associated with a group-based probabilistic estimate of future violence within a specific timeframe based on a normative sample of similar individuals. An example of an instrument which adopts this approach is the Violence Risk Appraisal Guide-Revised (VRAG-R; Harris et al. 2015), which communicates risk across nine numeric categories, with normative tables published in the measure’s manual (see Footnote 1). Second, the likelihood of future violence can be communicated using a numeric, ordinal approach in which sum scores are used to place individuals into a category, although those categories do not have associated probabilistic estimates of future violence. An example of an instrument which adopts this approach is the Brøset Violence Checklist (Almvik and Woods 1998), which communicates risk across three numeric categories based on sum score (see Footnote 2). Third is a semantic, ordinal approach in which the assessor communicates risk exclusively using labels not based on underlying probabilistic estimates or sum scores. An example of an instrument which adopts this approach is the Structured Assessment of Violent Extremism-30 (SAVE-30; Dean and Pettet 2017), which communicates risk across four semantic categories: Minimal, Low, Moderate, or High. Fourth and finally, the likelihood of future violence can be communicated using a semantic approach in which there are no discrete categories. Rather, the result is an actionable case formulation and scenario plan. An example of an instrument which adopts this approach is the Terrorist Radicalization Assessment Protocol-18 (TRAP-18; Meloy 2017).

Table 1 Overview of Risk Communication Approaches in Established Risk Assessment Instruments

Regardless of which of these four approaches is used to communicate about the likelihood of future danger, it is important to take into consideration the role of cognitive bias in how that information is understood by the intended recipients. For example, although forensic mental health professionals and mock jurors say they prefer semantic over numeric categories (Dieckmann et al. 2009; Evans and Salekin 2014; Falzer 2013; Varela et al. 2014), research suggests that they still subconsciously apply probabilistic estimates to semantic categories, just in a highly subjective and unreliable way. To illustrate: In a study of psychologists, psychiatrists, and nursing staff, participants assigned probabilistic estimates of future violence ranging from 0 to 51% for semantically “Low Risk” individuals and estimates ranging from 5 to 80% for semantically “High Risk” individuals (Rettenberger et al. 2016). Such findings are particularly important because, regardless of how many semantic or numeric categories an assessment instrument may have, research suggests that there is still a tendency to dichotomize the likelihood of future violence into “Low Risk” and “High Risk” (Krauss et al. 2018). That said, in addition to estimating the likelihood of future violence, risk communications should also include the predicted severity, frequency, and imminence of that violence, as well as whether a particular case should be prioritized (Douglas 2014; Douglas et al. 2014).

Violent Extremism Risk Communication in Germany

Although there is a thriving research literature on methods of violent extremism threat communication used by police in Western Europe (Ahmed et al. 2021), North America (RTI International 2018), and Australia (State of Victoria Department of Premier and Cabinet 2017), studies on risk communication approaches by security professionals in Central Europe (see Footnote 3) are scarce. Indeed, a systematic search conducted using PsycInfo and PubMed (January 1, 2003 to October 9, 2022) revealed no published surveys investigating the use and perceived utility of currently used risk communication schemes by Central European police agencies, including in the region’s most populous nation: Germany.

Germany is a Federal Republic in Central Europe composed of 16 states. The Federal Criminal Police Office (Bundeskriminalamt; BKA) is the country’s central security agency responsible for the assessment and management of violent extremism threats. Such threats relate to possible violent actions which would endanger national security and are judged to be sufficiently probable to occur in the imminent future (cf. Kretschmann and Legnaro 2019; 1 BvR 966/09). Intelligence gathered by other federal and state agencies on potential threats is forwarded to the BKA such that it can arrange for formal assessment and, if warranted, the development of a threat management plan.

Once intelligence on a threat of violent extremism has been received, the threat assessment process in Germany is undertaken in four steps: First, BKA officials determine whether the BKA is the correct agency to handle the threat-related intelligence and, if so, to which internal department it should be given. During this step, as much information as possible is collected on the nature of the threat, who reported it, and how it was reported. Second, assessors evaluate the credibility of the threat, including the trustworthiness of the person(s) who reported it as well as the plausibility of the predicted danger. Third, all collected information on the threat is combined in a written case formulation. A key part of this process is planning for two scenarios, one pessimistic and one optimistic. The pessimistic scenario assumes that an extremist act of violence will take place, and the optimistic scenario assumes that no such act will take place. The pessimistic scenario is assumed to be true unless there is compelling evidence to the contrary. When there is insufficient information to decide between these two formulated scenarios, a list of what is missing will be created and as much of that information gathered as possible to be able to confidently rule out the threat. Fourth and finally, the findings of the case formulation are summarized via a risk communication scheme.

The 8-Category Risk Communication Scheme

In an effort to develop a common language with which to communicate risk across different federal and state police agencies, the BKA established the 8-Category Risk Communication Scheme (8-RCS) in the 1990s (Rehkopf 2022). The ordinal scheme comprises eight semantic “levels” into which threats to public safety are classified (Level 1 = harmful event is expected, Level 8 = harmful event can be ruled out; Deutscher Bundestag 2017). Although the scheme is conceptualized such that cases at lower levels represent a higher probabilistic threat (and vice versa), the 8-RCS levels are not anchored by normative reference standards. As no standard set of risk and/or protective factors is taken into consideration when determining into which of the eight levels a case falls, the 8-RCS can best be viewed as a summary of the case formulation process rather than as an actuarial or SPJ instrument. Notably, validated actuarial and SPJ violent extremism threat management instruments most often have three to four semantic ordinal levels (see Table 1), making the 8-RCS unique in its extensive scaling. Once an individual has been assigned to one of the eight risk levels, this information is forwarded to all relevant federal and state security authorities as quickly as possible.

Present Study

The aim of the present field study was to conduct the first survey of violent extremism threat management professionals at the federal and state levels of the German police to evaluate the use and perceived utility of the 8-RCS as part of a comprehensive case formulation process. Participants’ level of threat assessment and management experience with different kinds of politically motivated violent extremism was measured, as were beliefs about the reliability and construct validity of the eight levels of the currently used risk communication scheme. Finally, given the emphasis on investigating risk communication practices with the hope of improving the efficacy of inter-agency collaboration, evidence-based recommendations for how to improve the case formulation and 8-RCS process are provided.

Method

Participants

Survey participants were recruited via an invitation e-mail sent by the BKA to federal and state police agencies in Germany. The survey link was opened by 207 individuals. After excluding individuals who answered no questions or only answered demographic questions (n = 39), who were not involved in violent extremism threat management (n = 7), or who rated all ordinal questions at maximum values (n = 3; see Footnote 4), the final participant count was 158.
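The exclusion flow above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in this section; the variable and dictionary key names are illustrative and not taken from the study materials.

```python
# Counts as reported in the Participants section.
opened_survey = 207

# Reasons for exclusion and the number of individuals removed for each.
exclusions = {
    "answered no questions or demographics only": 39,
    "not involved in violent extremism threat management": 7,
    "rated all ordinal questions at maximum values": 3,
}

# Final analytic sample = everyone who opened the link minus all exclusions.
final_sample = opened_survey - sum(exclusions.values())
print(final_sample)
```

The result matches the final participant count of 158 stated above.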

Measures

An online survey comprising both closed- and open-response questions and designed to take approximately 25–30 minutes was constructed using Qualtrics (Qualtrics 2020). The questions focused on the BKA’s 8-RCS and were drafted by a team of three psychologists (AR, LR, MW) at the University of Konstanz and one criminal justice specialist (VP). The drafted questions were reviewed and finalized by the BKA, resulting in a survey containing 21 questions, with a further 7 questions asked contingent upon participant responses using conditional logic. The survey covered the following four blocks: (1) demographics, (2) threat management experience, (3) perceived utility of case formulations and the 8-RCS, and (4) perceived meaning of the 8-RCS levels. To encourage survey completion, only the questions in the demographics block were required to be answered.

Collected demographic information included gender (0 = Male, 1 = Female) and age (in years). The amount of threat management experience in the police context was measured (in years), as was the area of politically motivated threats in which participants specialized (0 = Religious Ideology, 1 = Right-Wing Ideology, 2 = Left-Wing Ideology, 3 = Foreign Ideology, 4 = None of These). Whether participants conducted threat assessments themselves or received the findings of threat assessments from a different agency was also measured (0 = Primarily Internal Assessment, 1 = Primarily External Assessment). Among those participants who conducted threat assessments themselves, the frequency with which such assessments are carried out was examined (0 = Daily or Weekly, 1 = Monthly, 2 = Once Every Several Months, 3 = Rarely).

The perceived utility of case formulations (written assessments including the 8-RCS) and the 8-RCS (specifically) was rated on a six-point Likert scale (1 = Very Helpful, 6 = Not at All Helpful). The perceived meaning of the 8-RCS levels was measured in two ways: First, participants were asked to estimate the probability of a violent attack in each of the 8-RCS levels (out of 100%). Second, participants were asked at which level of the 8-RCS a threat becomes so severe as to justify police intervention through either covert tracking or overt confrontation of the individual. Finally, whether participants believe (1) two threat assessors using the same information usually arrive at the same 8-RCS level, (2) two recipients of threat assessment findings usually have the same understanding of each 8-RCS level, and (3) the 8-RCS rating adequately reflects risk level (i.e., construct validity) was rated on a six-point Likert scale (1 = Strongly Agree, 6 = Strongly Disagree).

Statistical Analysis

Statistical analyses were conducted using Stata IC 16 (StataCorp 2019). The findings of closed-ended questions were analyzed descriptively, with correlational sensitivity analyses being conducted to explore the potential moderating impact of gender, age, and years of threat management experience on perceived utility of case formulations and the 8-RCS as well as on the perceived meaning of the 8-RCS levels. The threshold for statistical significance was set to α = 0.05. Answers to open-ended questions were analyzed via a two-step Qualitative Content Analysis approach (Mayring 2015; Mayring and Fenzl 2019). In the first step, response clusters were established for each of the questions by MW and LR. Responses could be classified as belonging to more than one cluster. In the second step, an inter-rater reliability check was conducted by an independent rater who was a post-doctoral fellow in the Department of Psychology at the University of Konstanz. Disagreements were resolved through discussion between the three concerned researchers.
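To illustrate the kind of correlational sensitivity analysis described above, the following is a minimal sketch of a Spearman rank correlation, which is the statistic suggested by the rs notation used in the Results. It is not the authors' Stata code. The function names and the example ages and ratings are hypothetical and were invented purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every element tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: participant age and utility rating
# (1 = Very Helpful, 6 = Not at All Helpful).
ages = [25, 31, 38, 44, 52, 59]
ratings = [4, 3, 3, 2, 2, 1]
rho = spearman_rho(ages, ratings)
print(rho < 0)
```

Because lower ratings indicate greater perceived helpfulness on this scale, a negative coefficient means that older participants gave more favorable ratings and younger participants less favorable ones.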

A priori χ2 and t-tests were conducted to assess whether there were systematic differences in gender, average age, and years of threat management experience between the 158 participants included in the final analyses and participants who were excluded on account of not being involved in violent extremism threat management, having answered only demographic questions or having rated all ordinal questions at maximum values. No statistically significant differences in gender (χ2(2) = 2.58, p = 0.275) or average age (t(185) = 0.83, p = 0.410) were found. However, there was a statistically significant difference in threat management experience, whereby excluded participants had fewer years of such experience than those who were included, t(186) = 2.14, p = 0.033.
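As a sketch of how a group comparison of this kind could be computed, the following implements a two-sample t statistic with pooled variance, for which the degrees of freedom equal the combined sample size minus two. Whether the reported tests used pooled variances is an assumption made here for illustration, and the example values are hypothetical rather than study data.

```python
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical years of threat management experience (not study data).
included = [10, 12, 8, 9, 11]
excluded = [4, 6, 5, 3]
t_value, df = pooled_t(included, excluded)
print(df)
```

A positive t value here indicates that the first group has the higher mean, which corresponds to the direction reported above for included versus excluded participants.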

Results

Demographics and Threat Management Experience

Survey participants included 158 police officers on violent extremism threat management teams in Germany. The majority of participants self-identified as male (n = 100, 63.69%), with an average age of 43.08 years (SD = 9.84) and an average of 7.94 years (SD = 7.22) of threat management experience in the police context.

Participants specialized in the management of politically motivated threats inspired by religious ideology (n = 119, 75.32%), right-wing ideology (n = 24, 15.19%), foreign ideology (n = 10, 6.33%), or left-wing ideology (n = 5, 3.16%). The majority of participants (n = 88 of 145, 60.69%) were responsible for acting on threat assessments they conducted, themselves, whereas the remainder were expected to act on threat assessments received exclusively from a different agency. Of those 88 participants who conduct threat assessments, themselves, 23 (26.14%) do so daily or weekly, 28 (31.82%) do so monthly, 12 (13.64%) do so once every several months, and 25 (28.41%) do so on rare occasions.

Perceived Utility of Case Formulations and the 8-RCS

As displayed in Fig. 1, almost all of the participants rated case formulations to be either “Very Helpful” (n = 56 of 90, 62.22%), “Helpful” (n = 22 of 90, 24.44%), or “Somewhat Helpful” (n = 10 of 90, 11.11%), with a minority (n = 2 of 90, 2.22%) rating case formulations as “Somewhat Unhelpful” or “Unhelpful”. The majority of participants rated the 8-RCS as “Very Helpful” (n = 16 of 79, 20.25%), “Helpful” (n = 43 of 79, 54.43%), or “Somewhat Helpful” (n = 15 of 79, 18.99%). The remainder (n = 4 of 79, 5.06%) rated the 8-RCS as either “Somewhat Unhelpful” or “Not Helpful”.

Fig. 1 Perceived Utility of Case Formulations and the 8-RCS

In an open-ended follow-up question, four response clusters were identified as to the major perceived benefits of the 8-RCS: (1) aids in developing actionable threat management plans of appropriate scope and intensity (n = 25 of 62, 40.32%), (2) provides a nationwide standard for the uniform communication of risk (n = 17 of 62, 27.42%), (3) represents an objective threat summary for individual cases (n = 8 of 62, 12.90%), and (4) provides a reference against which to compare one’s own threat assessment (n = 9 of 62, 14.52%).

Perceived Meaning of 8-RCS Categories

As displayed in Fig. 2, when participants were asked to estimate the probability of a violent attack for each of the 8-RCS levels, cases in Level 8 received the lowest estimates and cases in Level 1 received the highest estimates. However, ranges overlapped considerably, and the overlap in estimated violence probabilities was particularly notable among cases in Levels 4, 5, and 6.

Fig. 2 Box-and-Whisker Plot of Probability Estimates for a Violent Attack at Each Level of the 8-RCS

When participants were asked at which level of the 8-RCS a threat becomes so severe as to justify police intervention through covert tracking (see Fig. 3), each of the following levels was identified as the cut-off threshold by approximately one-fifth to one-fourth of the sample: Level 6 (n = 19 of 71, 26.76%), Level 5 (n = 17 of 71, 23.94%), Level 4 (n = 14 of 71, 19.72%), and Level 3 (n = 19 of 71, 26.76%). Overt confrontation by police was most often perceived to be justifiable with a cut-off threshold of either Level 3 (n = 25 of 68, 36.76%) or Level 2 (n = 18 of 68, 26.47%).

Fig. 3 Justifiable Threat Management Interventions Associated with Levels of the 8-RCS

Despite the variability in estimated violent attack probabilities and intervention cut-off thresholds across the levels of the 8-RCS, Table 2 reveals that four-fifths of participants believe two threat assessors using the same information usually arrive at the same level (n = 65 of 82, 79.27%). Three-fourths of participants also believe that two recipients of threat assessment findings usually have the same understanding of each 8-RCS level (n = 62 of 82, 75.61%). Finally, more than four-fifths of participants believe that the 8-RCS rating adequately reflects risk level (n = 69 of 82, 84.15%).
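The percentages in this paragraph follow directly from the reported counts. The following is a minimal sketch for recomputing them; the helper name and dictionary keys are illustrative, and only the counts and the denominator of 82 come from the text above.

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to two decimal places."""
    return round(100 * count / total, 2)


# Counts of agreeing participants for the three shared-belief questions.
respondents = 82
agreement_counts = {
    "two assessors arrive at the same level": 65,
    "two recipients share the same understanding": 62,
    "rating adequately reflects risk level": 69,
}

for belief, count in agreement_counts.items():
    print(f"{belief}: {percent(count, respondents)}%")
```

The same helper can be applied to any other count and denominator pair reported in the Results as a quick consistency check.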

Table 2 Shared Beliefs About 8-RCS Levels Rated on a Likert scale (1 = Strongly Agree, 6 = Strongly Disagree)

Sensitivity Analyses

No statistically significant moderating effects were found for participant gender or years of threat management experience on perceived utility of case formulations and the 8-RCS as well as on the perceived meaning of the 8-RCS levels. In addition, no significant relationship was found between participant age and the perceived meaning of the 8-RCS levels. However, a statistically significant correlation was found between participant age and perceived utility of the 8-RCS (rs = −0.24, p = 0.032) such that younger participants rated the 8-RCS as being less helpful. Younger participants were also found to rate case formulations as being significantly less helpful (rs = −0.25, p = 0.017).

Discussion

Violent extremism is a major public safety and national security issue, resulting in increased demands for reliable and accurate methods of threat assessment and management. For cooperation in preventing threats to be successful across federal and state police agencies, effective methods of risk communication are essential. Although much is known about numeric and semantic risk communication systems in Western Europe, North America, and Australia, the research literature on this important subject in Central Europe is virtually nonexistent. To address this, the present study surveyed violent extremism threat managers at federal and state police agencies in Germany to evaluate the use and perceived utility of the eight-level risk communication scheme which is currently part of mandatory police operating procedures. The sample consisted of 158 predominantly male police officers with an average of 8 years of experience, approximately 60% of whom conducted threat assessments themselves and three-fourths of whom specialized in politically motivated threats of violent extremism inspired by religious ideology.

Both case formulations and the assignment of an 8-RCS level are carried out by violent extremism threat assessors without the use of a consistent, manualized process. Although participants agreed that the status quo aids in developing actionable threat management plans of appropriate scope and intensity, the answers to two key closed-ended questions revealed evidence of a lack of reliability and construct validity, both of which are hallmarks of unstructured professional judgments (Viljoen et al. 2021). First, when participants were asked to estimate the probability of a violent extremist attack by individuals in each of the 8-RCS levels, there was considerable within-level variability in the predicted likelihood of an attack being carried out. Second, there was a lack of consensus as to when a communicated threat of violent extremism becomes so severe that police intervention is warranted through either covert tracking or overt confrontation. Participants appeared largely unaware of this inconsistency, as three-fourths expressed the belief that two recipients of threat assessment findings usually have the same understanding of each 8-RCS level.

Collectively, the findings of the present study suggest that federal and state police agencies in Germany would benefit from tailoring their risk communication process to accord with those of validated violent extremism threat assessment instruments. Like the 8-RCS, a number of empirically supported tools follow a semantic, ordinal approach to risk communication, including the Structured Assessment of Violent Extremism-30 (SAVE-30; Dean and Pettet 2017) and the Islamic Radicalization-46 (IR-46; Lloyd 2019), both of which use four risk categories. Following on from this, it is recommended that the number of levels in the current BKA scheme be reduced.

Given the variability of probabilistic estimates assigned in the present study by threat managers to threats classified in each level of the 8-RCS and given that no clear cut-off threshold was found for which level(s) would justify either covert tracking or overt confrontation by police, there does not appear to be a defensible reason for a risk communication scheme with eight discrete levels. Officials may wish to consider reducing the number of levels to four, as four-level semantic, ordinal violent extremism risk communication approaches are already used by police agencies in neighboring countries such as Belgium (cf. Ahmed et al. 2021). Each of the four levels of the updated BKA scheme would correspond to a predefined action: Level 1 would imply the need for overt confrontation, Level 2 would imply an urgent need for covert tracking, Level 3 would imply a non-urgent need for covert tracking, and Level 4 would imply a lack of need for threat management intervention. This reduction in categories from eight to four would retain the major benefits of the 8-RCS, including its high levels of perceived helpfulness in managing violent extremist threats, its serving as a nationwide objective standard for communicating risk for individual cases, and its providing a reference against which to compare an officer’s own threat assessment.
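The proposed four-level scheme amounts to a fixed lookup from level to predefined action. The following is a minimal sketch of that mapping as described in the paragraph above; the names `PROPOSED_ACTIONS` and `recommended_action` are hypothetical, and this is an illustration rather than an operational BKA specification.

```python
# Mapping from each proposed level to its predefined action,
# following the wording of the recommendation above.
PROPOSED_ACTIONS = {
    1: "overt confrontation",
    2: "covert tracking (urgent)",
    3: "covert tracking (non-urgent)",
    4: "no threat management intervention",
}


def recommended_action(level: int) -> str:
    """Return the predefined action for a proposed level (1 to 4)."""
    if level not in PROPOSED_ACTIONS:
        raise ValueError(f"level must be one of {sorted(PROPOSED_ACTIONS)}")
    return PROPOSED_ACTIONS[level]


for level in sorted(PROPOSED_ACTIONS):
    print(level, recommended_action(level))
```

Tying each level to exactly one action means that a communicated level carries the same operational meaning for every recipient, which addresses the lack of consensus on intervention thresholds observed under the eight-level scheme.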

Limitations

There were four principal limitations of the present investigation which suggest that findings should be regarded as preliminary until independent cross-validation studies can be conducted. First, participants were not asked in which police agency they worked so as to maintain anonymity and increase the likelihood of truthful responses. However, this limited our ability to determine the representativeness of the sample, as it is possible that a disproportionate number of participants came from a single German state or department of the BKA. Further, excluded participants differed significantly from included participants in that they reported fewer years of threat management experience, thus leading to an over-representation of well-experienced individuals. Results might therefore be different in a sample of less experienced threat managers. Moreover, respondents who did not answer all questions of the survey may have differed systematically from respondents who did in their beliefs about the 8-RCS (see Footnote 5). Second, information was not available on how many threat management team members there are in Germany. As such, the sampling frame and, hence, the response rate of the survey could not be calculated. This methodological limitation is shared with previous surveys of German police officers (Clemens et al. 2020; Lorey and Fegert 2022; Rüdiger and Rogus 2014). Third, participants’ level of numeracy was not measured but could have been an important moderating factor with regard to the perceived probability of a future threat across each level of the 8-RCS (cf. Dieckmann et al. 2009). Fourth, content analysis of case formulations is needed to see whether, in addition to likelihood as expressed by the 8-RCS, there is also information on the predicted severity, frequency, and imminence of that violence, as well as whether a particular case should be prioritized.
In addition to these principal limitations, motivational, comprehension, reactivity, and software-related errors common to all probability-based electronic surveying methods also apply to the present survey (for a review, see Couper 2000).

Conclusion

Threat management teams in the German police specializing in violent extremism are expected to act upon unstructured professional judgments of risk, resulting in intervention decision-making that is vulnerable to cognitive biases. Given the state of the art in adjacent behavioral health fields such as forensic mental health and corrections, adopting a four-category risk communication scheme similar to those used by validated threat assessment instruments would increase the reliability and, hence, the accuracy of the updated system. Further, developing a complementary alert system whereby different levels of intervention are triggered by classification into one of the four categories would improve the efficacy of resulting threat management plans through clear and actionable risk communication. The help of dedicated agents of change, including policymakers as well as internal thought leaders in the BKA, is needed to facilitate these procedural changes and ensure fidelitous uptake (Carroll et al. 2007). Particular outreach efforts should be made to younger threat managers, whom the present study identified as believing the 8-RCS and case formulations are less helpful than older threat managers do. Although it will be a challenge to implement this systemwide shift, there will be significant and measurable benefits to both public safety and national security.