Background

The risk of cross-border transmission of infectious disease pathogens increases with the rise in global travel of people and transfer of goods [1]. Travelers, goods, or vectors infected in one place could transmit diseases to other travelers during their journey or infect the population in the country of destination. Locally, at points of entry (POEs) – airports, ports, and ground crossings – management of high numbers of infected or exposed travelers can be challenging and can have a significant economic impact. During the SARS and current COVID-19 pandemics, for example, entry and exit screening was implemented at POEs worldwide [2, 3], and contact tracing was performed for hundreds of travelers [4]. Capacities and procedures for management of public health events at designated POEs have been agreed by the WHO States Parties in the International Health Regulations (IHR) 2005 [5]. However, translating capacity into an appropriate, timely, and efficient response to cross-border spread requires collaboration and communication between many disciplines, levels, and countries [6], and consequently, ongoing efforts to stay prepared. To support many POEs at the same time, many partners, the World Health Organization, and the European Union have been organizing multi-national training programs and simulation exercises [7, 8].

Despite all these efforts, we currently have no insight into the different education, training, and exercises (ETEs) that are carried out at POEs and what their effect is. A literature review in 2017, studying training on infectious disease control, reported that the included studies contained insufficient detail on the methodologies of training and did not report any results [9]. To employ future efforts (time, costs, intentions) as efficiently as possible, we integratively reviewed the available scientific literature [10, 11] to identify 1) the different ETE methodologies to train professionals in infectious disease management, 2) how these ETEs are evaluated, and 3) what evidence is available for their effectiveness, with particular attention to cross-border settings, such as POEs.

The theoretical framework

To research the existing body of literature, we built a theoretical framework based on integrated theories and principles of effective teaching and learning. We combined the seminal Kirkpatrick [12, 13], Input Process Outcome [14] and Context Input Reaction and Outcome models [14, 15], the principles of adult learning [16, 17], the Self-Determination Theory on motivation [16], and techniques supporting sustainability [18]. In short, our framework states that ETEs rely on their context, input, and process and result in outcomes that can be evaluated at several points in time and at four different levels (Fig. 1). The extensive theoretical background can be found in Additional file 1.

Fig. 1

The context, input, and process affect the outcome of education, training, and exercises. The outcome of education, training, or exercises can be evaluated at four levels (Kirkpatrick 1996). Lower levels are easier to assess, while higher levels show better sustainability of outcomes.

Context, input, and process

The context comprises the environment of the learner [13]. This context influences learning, and mainly the application and implementation of what is learned. An example is the participants’ ability to change existing practices in a larger system. Learning in a context that welcomes change stimulates learning and its application. Other contextual factors are the workload, the training needs, and the autonomy of learning and application for the specific target group. A context of specific interest is the cross-border setting, here defined as a setting with interaction between different nation-states, such as in border regions, points of entry, or other multi-country settings.

The input covers the external conditions of the ETE, such as the thoroughness and quality of the material development, participants’ prior knowledge, the ETE topic, and the facilitators’ experience [13, 14]. Regarding this last factor, the training-of-trainer (TOT) approach is of interest. In a TOT design, participants are trained as trainers or facilitators to deliver ETEs themselves, through which the reach of an ETE can be enlarged. However, the trainers’ quality should remain at a sufficient level.

The process comprises the implementation and design [13]. Either more classical designs are used, such as education based on presentations, training with workshops, or table-top exercises; or more innovative designs are used that enlarge an ETE’s reach or enhance realism. Other process factors are clarity of learning goals, interactivity and problem-based learning, and the duration and frequency of learning moments.

Evaluation & Outcome

According to our theoretical framework, the context, input, and process affect the effectiveness or outcome of an ETE. Three evaluation moments are distinguished: the pre-test right before the ETE, the post-test right after the ETE, and the follow-up test one to several months after the ETE. The pre-test sets the baseline for learning, the post-test shows the direct and short-term effect of the ETE, and the follow-up test assesses the sustainability of the effect over time. In addition, control groups are required to exclude external effects. The ETE outcome can be evaluated at four levels: reaction, learning, behavior, and system (Fig. 1) [12, 13]. The reaction level assesses participants’ satisfaction, either quantitatively or on content. The learning level assesses the improvement of knowledge, skills, or attitudes. Although knowledge and skills are best assessed using tests or demonstrations instead of self-assessments, for this study, both these objective and subjective measures are interpreted as learning. The behavior level assesses the change in individual working practice. Because objectively measuring behavioral change is often complicated and time-consuming, we include both objective and self-assessed change at this level. At the system level, change is organizational. Examples are standard operating procedures, contingency plans, or the information or communication flow through an organization. While reaction and learning are more easily assessed, behavioral or system change indicates higher sustainability of the outcome [12, 13]. Although lower levels are indispensable in motivating, monitoring, and purposefully investing in the professionals that make up the public health system, the system level addresses the public health roles from a macro perspective. Outcomes on this level are therefore most relevant from a public health perspective.

Education, training, and exercises

Based on our theory, education, training, and exercises are treated alike; these all aim at improving performance. Nevertheless, their differences are defined as follows: education is a process of individual learning in a general sense leaving several options for application available; training is a more practical and specified way of learning, also addressing practical aspects; exercises are a practical simulation of real practice.

Methodology

Literature search

To collect evaluations of ETEs in infectious disease control, we conducted a systematic, electronic search in the databases of Cinahl, Embase, Eric, Medline, PsycInfo, and Web of Science. The search covered the period from the start of each database (Cinahl: 1982; Embase: 1974; Eric: 1965; Medline: 1946; PsycInfo: 1967; Web of Science: 1900) until 24 September 2018. We searched for a combination of “public health”, “infectious disease”, “cross-border”, “effectiveness”, “training”, and their synonyms. The search strategy can be found in Additional file 2.

Inclusion criteria

First, we screened titles and abstracts and included studies that described an evaluation of an ETE with a topic in infectious disease control from a public health perspective, or for which compliance with these criteria remained uncertain. Subsequently, the full texts of these studies were screened. Studies were included if an evaluation of the ETE was described in the paper and public health professionals, at the local, regional, or national level, were among the target population. Studies were excluded if no public health professionals were included as participants or if the topic was restricted to research, a specific therapy (such as the use of anti-virals in a therapeutic setting), or laboratory practice. An overview of the in- and exclusion criteria is shown in Fig. 2. The reference lists of included studies were screened for additional relevant studies, using the same criteria. For both the abstract and full-text screening, the first 25% of studies were screened independently by two authors (DdR, EB) and compared afterward. Any disagreements between the authors were discussed until consensus was reached, before one author (DdR) continued with the other 75%. In total, 62 studies were included.

Fig. 2

Flowchart of the systematic literature search

Quality assessment

We assessed the quality of the ETE’s methodology and the quality of the evaluation for all studies. The first assessment was based on six questions from the Quality Standards in Education and Training Activities of the Youth Department of the Council of Europe 2016 [19], the second on six questions of the NICE Quality appraisal checklist for qualitative studies [20]. The quality assessment form can be found in Additional file 3. For both parts, a maximum of twelve points could be scored, and scores were divided into tertiles labeled bad, moderate, or good. The first 25% of studies were scored independently by two authors (DdR, EB). After comparing and discussing the scores, one author continued (DdR).
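As a rough illustration of this scoring rule, the sketch below maps a score of 0 to 12 points to one of the three categories. The cut-off of 9 points for a good score is the one reported with the results; the lower cut-off of 5 points, the function name, and the parameter names are assumptions made for illustration only, since the exact tertile boundaries are not stated in the text.

```python
def classify_quality(score: int, moderate_from: int = 5, good_from: int = 9) -> str:
    """Map a quality score (0-12) to 'bad', 'moderate', or 'good'.

    good_from=9 follows the threshold reported in the Results;
    moderate_from=5 is an assumed boundary, not taken from the paper.
    """
    if not 0 <= score <= 12:
        raise ValueError("score must be between 0 and 12")
    if score >= good_from:
        return "good"
    if score >= moderate_from:
        return "moderate"
    return "bad"
```

The same function could be applied separately to the ETE-methodology score and the evaluation score of each study.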

Data extraction and analysis

We performed an integrative review, inspired by the steps of Whittemore and Knafl [10, 11]. First, we designed a data extraction form based on the theoretical framework (Fig. 1). Then, we extracted data on variables of context, input, process, and outcome, as shown in Tables 1 and 2, along with basic study characteristics, such as the journal, publication year, country, and funding. We analyzed the context, input, process, and the four outcome variables by describing their occurrence and variety. Sub-analyses were performed for studies in a cross-border setting or with a TOT approach. If many studies described one of the four outcome variables, results were subdivided according to education, training, and exercises, or further between classical and innovative study designs. If the directions of outcomes differed strongly, we compared the context, input, and process characteristics of the studies.
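To make the extraction step more concrete, the following is a minimal sketch of how one row of such a data extraction form could be represented and how the occurrence of outcome levels could be counted. The class names, field names, and the two example records are hypothetical and are not taken from the review's extraction form or data.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class OutcomeLevel(Enum):
    REACTION = "reaction"
    LEARNING = "learning"
    BEHAVIOR = "behavior"
    SYSTEM = "system"


@dataclass
class ExtractionRecord:
    study: str
    ete_type: str                      # "education", "training", or "exercise"
    cross_border: bool = False         # context variable
    tot_approach: bool = False         # input variable
    levels: Set[OutcomeLevel] = field(default_factory=set)


def count_levels(records):
    """Count how many studies evaluated each outcome level."""
    counts = Counter()
    for record in records:
        counts.update(record.levels)
    return counts


# Hypothetical records for illustration only.
records = [
    ExtractionRecord("Study A", "training",
                     levels={OutcomeLevel.REACTION, OutcomeLevel.LEARNING}),
    ExtractionRecord("Study B", "exercise", cross_border=True,
                     levels={OutcomeLevel.REACTION}),
]
level_counts = count_levels(records)
```

Filtering the records on `cross_border` or `tot_approach` before counting would correspond to the sub-analyses described above.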

Table 1 Baseline characteristics of the included studies

Table 2 Results

The results are presented in line with the theoretical framework. First, factors of context, input, and process are presented; afterward, the outcomes per level are described, where possible with reference to the context, input, and process characteristics. In this way, we provide an overview of the characteristics of ETEs in infectious disease control, the accuracy of their reporting, and possible links between the context, input, and process of ETEs and the outcome.

Results

Literature search

In total, 2201 unique studies were identified. After applying the in- and exclusion criteria for titles and abstracts, 186 full texts were screened, leading to 51 inclusions. Citation screening led to the inclusion of another 11 studies. Figure 2 shows the flowchart of the search and selection process. The quality assessment resulted in seven studies with a good score for training (score ≥ 9), and 23 with a good score for the evaluation (score ≥ 9). Ten studies had a good quality score after combining the scores (score ≥ 17). All scores can be found in Additional file 4.
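The selection counts can be checked with simple arithmetic. The sketch below uses only the numbers reported above; the variable names are illustrative, and the exclusion counts are derived under the assumption that every study that did not proceed to the next step was excluded at that step.

```python
identified = 2201              # unique studies identified by the search
full_text_screened = 186       # studies remaining after title/abstract screening
included_from_search = 51      # inclusions after full-text screening
included_from_citations = 11   # additional inclusions from citation screening

# Derived counts (assumption: no other losses between steps).
excluded_on_title_abstract = identified - full_text_screened
excluded_on_full_text = full_text_screened - included_from_search
total_included = included_from_search + included_from_citations

print(excluded_on_title_abstract, excluded_on_full_text, total_included)
```

The resulting total of 62 matches the number of included studies stated in the inclusion criteria section.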

Context

Five studies covered ETEs in a cross-border setting, either a border region (n = 2), a point of entry (n = 2), or a multi-country setting aimed at international cooperation (n = 1). All other ETEs were in a non-cross-border setting.

Target group

The target group of the ETE varied among studies but was often poorly described. Examples are ‘public health leaders’ and ‘all staff of regional health departments’. Other studies specified a wide variety of professionals with different tasks in emergency preparedness, or mixed public health professionals with emergency responders, university staff, and civilians. Participants’ motivations to participate could rarely be derived from the studies.

Recruitment & Autonomy

The majority of studies did not report the recruitment technique or clarify participants’ motivation. Three studies reported mandatory participation, six studies highlighted the free choice of people participating, and two reported on freely available online courses. In Hoeppner et al., participants had to apply for participation, which suggests motivation [49]. Fowkes et al. 2010 identified their highly motivated participants as a limitation in the interpretation of the effectiveness of the ETE [45].

Training needs

In total, eleven studies performed a training needs assessment among the target population before designing the ETE. Training needs were also obtained via literature studies, the ETE designers’ experience-based vision [23], or by examining disaster plans and local emergency management policies [55, 69]. Several studies specifically aimed to identify gaps and needs through the exercise [56, 68].

Input

Training topic

The studies discussed a wide variety of ETE topics. Twenty-three studies focused on preparedness and seventeen on response. The main topics were bioterrorism (n = 8), a pandemic (n = 8), or a specific disease outbreak (n = 9), of which five focused on influenza. Less common topics included risk communication [41], leadership [64], and One Health [33]. Five studies, all TOTs, incorporated didactics as a training topic.

Trainers

A minority of studies indicated that they had competent, experienced trainers or facilitators (n = 18). A majority of studies either described the trainers without showing their experience or competence, using general terms such as “instructor” or “university staff”, or left the trainers completely unreported (n = 30).

Development & quality of the material

The development of learning material was discussed in all but seventeen studies. Most theories were derived from constructivist learning principles, such as the Adult Learning Theory [37, 60] or problem-based learning [81]. Other theories used included the Dreyfus model [59], theory from Benner [49, 59], continuing education [28, 59], and blended learning [36]. ETEs were also based on existing competencies [37, 44, 50, 71], previously existing materials, and developers’ experience from previously performed training or exercises. The developers of the material were mostly public health professionals (n = 12), followed by people from universities or public health schools (n = 10). The help of higher-level bodies, such as a ministry, the national center for disease control, or the WHO, was mentioned several times [32, 63, 74]. In two studies, graphical designers were involved in the development of realistic images or virtual environments [76, 82].

Process

Classical designs

Eight studies described educational programs as part of university programs or courses, of which Yamada et al. described an interdisciplinary and problem-based methodology during education [81], and Orfaly & Biddinger et al. and Rega et al. integrated table-top exercises in university courses [61, 67]. In the other six studies, methods were weakly described, merely referring to university programs or courses.

Nineteen studies evaluated a training, several of which combined their training session with an exercise [25, 65] or real-life project [64]. Two studies left their training methodology unspecified [35, 42]. Of the other studies, all except one supported interactivity among learners or between learners and trainers by referring to interactive lectures or discussion. Detailed descriptions of training designs were lacking; descriptions were restricted to summarizing phrases such as “using participatory methods” [31] or “an online lecture” [36, 63]. Studies that gave any detail on methodology referred to the adult learning principles, active learning, interactivity, multi-disciplinarity, or participatory methods, and explicitly moved away from passive methods.

Exercises were described in 24 studies, of which sixteen were table-top exercises and six were simulation exercises. The most common elements of table-top exercises in these studies were a lecture beforehand; a presentation of the scenario; an initial individual response; and a pre-arranged and guided discussion in small, multi-disciplinary groups of local partners. Subsequently, a presentation in a larger group and a debriefing followed. Often, more than one scenario was included in the exercise. The most considerable differences between studies were the level of detail of the described methodology and whether individuals, small groups, or large groups had to respond. Again, we saw more detailed descriptions in studies that referred to the adult learning principles.

Innovative design - wide reach

Seven studies had a TOT design, of which three integrated a second wave of training. This second wave was delivered by the TOT participants [33, 54, 74], so that participants could immediately apply what they had learned. All TOTs contained mixed methods. Often passive methods, such as lectures or presentations, were combined with active methodologies, such as guided discussions, clinical training, or active presenting. For two TOTs, the ETE methodologies used were largely unknown.

Seven studies examined ETEs with online or new methodologies such as virtual reality training [76], an audience response system [77], the use of the intranet for training [52], e-modules [28, 31, 50], and combinations of e-learning and on-site learning [36]. Online ETEs offered natural opportunities to spread the learning moments over a longer period. Also, participants were able to follow the ETE at their own pace. Some simulation exercises also used online methodologies in the form of blog websites where participants had to respond to signals from their office [21, 22, 52].

Innovative design - enhanced realism

Elements that were described to enhance the feeling of reality included the use of real work locations such as an airport [56]; a computer simulation model generating feedback depending on participants’ decisions in a simulation exercise [26, 27]; interaction with scenario cards guiding each exercise to different possible outcomes [70]; initial ambiguity in an exercise case and drop-out of participants during the exercise [71]; moulaged or simulated patients [29]; and external consultations of experts during the exercise [75]. Rega & Fink 2013 reported on a semester-long simulation exercise to maintain a realistic time frame [67].

Duration, interval & goals

The duration of ETEs varied between a 30-min training and years-long curricula. TOTs mostly lasted several days to weeks. Educational courses lasted between 14 h and two years, training between 14 h and one year. Fifteen studies did not elaborate on the duration of the ETE. The number of learning moments and the time between them were rarely described. The goals of the ETE were addressed in most studies (n = 47), although they were often stated on the organizational level or implicitly integrated into the text instead of being presented as trainable and measurable competencies. An overview of the outcomes on context, input, and process is shown in Table 1.

Evaluation & Outcome

System-performance

System performance was evaluated by four studies that used participants’ evaluations of organizational achievements after the ETE [32, 38], or external evaluations [45, 48]. None of the studies assessed the system effects of ETEs in a cross-border setting. Becker et al. 2012 evaluated a postgraduate education curriculum after two years in a developing setting [32]. This curriculum markedly strengthened the local public health system. The three other studies (n = 682; 1496; unknown) evaluated several table-top and simulation exercises. These exercises seem effective on the system level in improving workforce preparedness through emergency planning [45], relationships among colleagues [48], and communication systems [48]. Potter et al. 2005 did not aim to evaluate system performance but had a coincidental finding on this level: right after the training period, a real infectious disease outbreak occurred. According to the involved professionals, the response was well managed because the members of the response team had become acquainted with each other during the training [64].

Behavior

Nine studies, including two TOTs [60, 62], evaluated the outcomes on a behavioral level. Evaluation of behavior was primarily timed directly after the ETE, while six studies performed an additional follow-up test. Behavioral change was mainly self-assessed by participants, leading to subjective measurements. In one study, local supervisors were appointed to assess trainees’ behavioral change [36]; another used a report on ministry level next to participants’ self-assessments [40]. No control groups were used.

The educational curricula seem to change behavior, such as initiating the updating of plans, expanding professional networks, and improving collaboration (n > 244). Table-tops led, according to ministries’ reports, to increased development of further exercises and a more regular assessment of public health preparedness (n = unknown). Online modules had a low response rate (< 18%), but changed behavioral intentions among responding participants (n > 55) [28, 63]. According to local supervisors (n = 511), the combination of online learning and on-site training led to improved work performance. One study reported on behavioral change after table-tops in a multi-country setting but did not report any results on interaction between countries [40]. According to Orfaly et al. 2005 and Otto et al., TOTs seem moderately effective, since 20 and 44% of participants, respectively, had conducted exercises after six months (n = 118; n = 168) [60, 62].

Learning – knowledge

Thirty-three studies used knowledge to evaluate the effect of an ETE, including four TOTs and four ETEs in a cross-border setting. Knowledge was mostly evaluated with pre- and post-ETE knowledge tests (n = 20) rather than with self-assessments of knowledge using Likert scales. Compared to studies using knowledge tests, those using self-assessments reported more detail on how knowledge had improved. Knowledge particularly improved on organizational and functional content, such as understanding response protocols or describing functional roles or the chain of command within an organization. This is understandable, since self-assessments can explicitly ask about the content they target, while knowledge tests only provide a test score. No control groups were used; one study compared two groups that were exposed to two different methodologies [76].

Knowledge showed a clear increase directly after ETEs. The five studies that used knowledge tests, performed follow-up tests, and reported the results showed a statistically significant improvement in knowledge level directly after the ETE and up to 12 months after [42, 76, 78, 79, 80]. Response rates were unknown, and the duration of these ETE programs varied between fourteen hours and four weeks. Umble et al. showed an equal increase in knowledge between classical education and a broadcast [76]. Regarding ETEs in a cross-border setting, all of which used mixed methods and clearly stated their goals, a knowledge increase was shown after table-tops and training. However, these studies used self-assessments or unknown scoring methodologies.

Learning – skills

Twenty-one studies evaluated an ETE on skills, including three TOTs but none in a cross-border setting. Practiced skills vary from a majority of organizational, communicational, team, and leadership skills, to a minority of more medical skills such as surveillance or the use of personal protective equipment. Except for one study using skill demonstrations [74], most studies performed self-assessment of improvements comparing pre- and post-tests. Seven studies also performed a follow-up test.

According to participants’ self-assessments, all ETEs were effective skill-builders. A statistically significant increase in skills was shown for training, while this outcome remained non-significant for most table-top and simulation exercises. Follow-up evaluations even indicated a further increase in skills in the period after the ETE, although these results were self-assessed and mainly not statistically significant. Two TOTs showed a significant increase in planning, implementation, and evaluation after a table-top exercise [33, 54]; follow-up results were unavailable here.

Learning – attitude

Fifteen studies reported on a change in attitude, including one for a TOT [43], and one for several table-tops in a cross-border setting [40]. The evaluated attitudes comprised the awareness of and motivation to develop future preparedness plans and programs, or an increase in confidence. We saw mainly training and exercises evaluating attitude. Attitude was assessed by rating statements.

We saw a sustainable change in attitude directly and 1–3 months after both online and face-to-face training. These training programs lasted between 1.5 and 14 h but had unclear methods. Table-top exercises varied in their capability to change attitude, since both significant change [72, 75] and little to no change [34, 71] were shown, indicating that more detailed evaluation is required. The table-tops in a cross-border setting seemed to enhance participants’ motivation to develop and exercise programs [40]. Dickmann et al. 2016 reported a relation between knowledge and attitude: participants with higher knowledge also had congruent confidence levels to respond and advocate for change [41]. Data regarding TOTs were insufficient to aggregate results.

Reaction

Forty-five studies assessed ETEs on the reaction level, mostly by participants rating statements on satisfaction and methodology using Likert scales, directly after the ETE. The ETEs in cross-border settings showed high satisfaction among participants regarding table-tops and simulation exercises. One TOT showed satisfied participants of the second wave of training. Below, we present the results per design.

Training programs scored satisfactorily directly after the training, despite the substantial differences in design: after a 30-min pandemic preparedness training [46], 98% of participants thought the program valuable, as did 95% after several face-to-face modules on emergency preparedness [44], and 92–96% after a preparedness training of 14 days [78, 80]. Remarkably, the one study that performed a follow-up test reported the lowest satisfaction of all training programs, with a mean score of 4/5 after a 2-day Zika response training [35].

Only one study evaluated reaction after an exercise with a follow-up test [40]; all others were restricted to post-tests. Table-top exercises overall scored high on satisfaction, mainly based on their potential to practice together (77% agreed [34]); to build relationships (80–90% agreed [58]); to improve emergency or contingency planning (73% agreed [34]); and to identify gaps (89% [62] and 77% [58] agreed). Biddinger et al. identified higher satisfaction among regional exercise respondents compared with single-institution respondents regarding their understanding of agencies’ roles and responsibilities (p < 0.001), engagement in the exercise (p = 0.006), and satisfaction with the combination of participants (p < 0.001) [34]. In several studies, the right combination of participants was scored as one of the most valuable aspects. A disadvantage of table-top exercises was the lack of identification of key gaps in individuals’ performance [40]. Further recommendations for exercises were: to clearly formulate specific objectives; to be as realistic as possible; to ground practical response in theory; to be designed around issue-areas rather than scenarios; to have forced, targeted, and time-delineated discussion and decision making; to have a limited number of participants but to include all key perspectives, especially leadership perspectives; to be collaboratively designed and executed with representatives from participating agencies, external developers, and facilitators; to have networking possibilities; and to use trained evaluators.

Simulation exercises were less often assessed on reaction, and outcomes showed a slightly lower satisfaction than for the table-top exercises. However, in three studies, “most participants” or over 80% of participants still agreed that their readiness had increased through the simulation. The full-scale simulation at an airport stressed the need for specific goals, to prevent the public health response from being deprioritized by trying to test everything at the same time [56]. Also, it is paramount to have clear roles and responsibilities of the various agencies involved, and to have all required capacity available [56]. One study showed a positive relationship between the duration of a joint exercise and the contact and communication between health departments afterward [22].

Ten studies reported reaction directly after innovative methodologies. Several studies added online blogs, pages, or systems to a simulation exercise [22], a lecture [50], or a combination of classical designs. Other studies evaluated purely technological methods such as an audience response system [77] or a virtual reality environment [82]. For innovative methods, satisfaction was generally high, although technical issues were often reported. For example, the e-modules in Baldwin et al. were launched via the intranet of a public health organization [30], thereby benefitting from high accessibility but facing extensive, unforeseen updates, rigidity toward change, and delayed updates because ownership was not designated. The VR environment exercise met its objectives and was considered time well spent, but the participants and authors suggested further technological innovation before this method can be used at large scale [82]. An overview of all outcomes, including those not mentioned above [24, 39, 47, 51, 53, 57, 66, 73], is shown in Table 2.

Conclusions

This study aimed to review the different ETE methodologies that are used to train professionals in infectious disease management, how these methodologies are evaluated, and what their effect is. We had a particular focus on cross-border settings, such as POEs, and on methodologies with a wide reach. We identified various types of ETEs – from nationwide online preparedness programs to hands-on local training during an outbreak – but with generally few details on the exact methodology. Both the lack of details and the predominance of short-term and subjective evaluations impede conclusions on which methods in which settings lead to both positive and sustainable outcomes. Our results point out the need for standardized evaluations, preferably with a long-term scope, that are shared among trainers and organizers. We developed a theoretical framework that can be used to structure future evaluations. These evaluations will then hopefully not only inspire future developers to come up with successful ETE designs but also lead to recommendations for the best exercise-effect ratio.

Reports on system- and behavioral-level outcomes are scarce, leaving us with a majority of lessons learned on lower outcome levels such as learning and reaction. While the convincing and sustainable increase in knowledge and skills is a hopeful indication for system improvements and supports the use of ETEs for learning, several intervening factors are possible. Among others, evaluation tests themselves are one of the most sustainable learning techniques [18]. The knowledge tests and demonstrations activate knowledge and skills and might be responsible for the effect. Also, control groups are missing, while the organization of an ETE often has an immediate cause, such as a growing pandemic or a recent bioterrorism attack. These events require ETEs, but might also lead to greater attention for and learning about the subject regardless of any ETE. We saw a learning effect that increases during follow-up and is independent of ETE duration, which further supports this confounding effect.

While cross-border infectious disease control receives international attention, and in Europe alone almost all countries have designated POEs that should be prepared to handle cross-border health threats, we identified only five studies describing an ETE evaluation in a cross-border setting. This number is too low to draw general conclusions about the effectiveness of ETEs in a cross-border setting. Findings from ETEs in general infectious disease management should be used for this setting until more specific evaluations are available. However, one crucial difference between the cross-border and non-cross-border setting is the larger and more diverse set of stakeholders involved in cross-border settings. Not only are several countries involved, but information and cooperation are also needed between general public health, health professionals, and specific port, airport and ground-crossing officials. While the studies in cross-border settings did not elaborate on their specific settings, many other studies in our review identified a strengthened network, better knowledge of roles and responsibilities, and enhanced relations among the most valuable aspects of training and exercises. In other disciplines, it was also found that sharing the same language [83], a focus on relationships, and collaborative management skills [84] are essential factors of collaborative learning. We consider these findings as prudent support for training and exercising in cross-border settings.

Because cross-border health threat prevention requires collaboration between countries and a shared minimum level of functioning, we considered TOT approaches and online methodologies for their potential to reach several locations and a large audience at once. Both methodologies have potential and some remaining challenges. TOTs seem as effective in learning as other training methods, and their participants are satisfied. Unfortunately, TOTs are only moderately effective regarding their principal goal: the organization and delivery of future ETEs. As a result, the potential exponential increase in delivered training sessions and trainees compared to single direct training remains limited. For TOTs to reach their potential, the barriers that TOT participants perceive, such as a lack of confidence, time or resources for ETE delivery, or other priorities during their duties, should be taken seriously in future TOT programs. Online methodologies overcome specific barriers that were identified for TOT approaches. Both our study and another recent review on undergraduate medical education indicate that online learning “enhances knowledge and skills”, while evidence is lacking “that offline learning works better” [85]. However, technical issues and a lack of ownership of the online environments are remaining barriers. Also, we only had a low number of studies to evaluate. We call for more, enhanced evaluation of ETEs using innovative and online methods, a need that has recently been stressed by other reviews and the WHO [86, 87].

This review has several strengths and limitations. First, we restricted our analysis to what was available in the peer-reviewed literature databases and did not study the body of grey literature. Although it is very probable that more evaluations have been performed, orienting searches in the grey literature yielded a limited number of evaluations, indicating that the majority of ETE evaluations in a cross-border setting are not made public. However, the theoretical framework developed in this study can be used on a wide variety of ETEs, including those not publicly available within public health organizations. Furthermore, this theoretical framework can be used to support the design and evaluation of ETEs, and more complete reporting in the peer-reviewed literature.

Second, the didactic scope of our review can be seen both as a limitation and a strength. The joint evaluation of education, training and exercising leads to broad and generic conclusions, possibly limiting conclusions on individual ETEs, whose goals vary widely. However, restricting our results to either education, training or exercising specifically is also problematic. Although general distinctions are possible, on an individual level these are often arbitrary. As our results show, exercises are often part of training or educational programs. For example, organization-wide exercises are used for training on the system level – is an organization prepared to respond effectively? – but training is also used for the handling of individual patients. We, therefore, chose to evaluate all three ways of learning, using the same four levels of evaluation. In this way, the results in this study not only display the effect of the ETEs, but also specify these effects by evaluation level.

Third, we restricted our focus to infectious disease prevention and control in a public health setting. While public health responses to chemical, radiological or nuclear threats demand another set of professionals, they share many of the aspects of contamination and could be included in future reviews. The theoretical framework developed in this study may well be applicable to evaluations in these adjacent disciplines. We consider it a strength that, to the best of our knowledge, this is the first attempt to assess ETEs in infectious disease control systematically. In addition to previous efforts [9], we studied evaluations and outcomes in greater detail, and with the comprehensive framework we developed, we have contributed to the body of knowledge regarding the systematic reporting and evaluation of ETEs.

Future studies should focus on the development of a standardized evaluation format integrating details of context, input, and process and suggesting planning and questionnaires for evaluations. Future training developers should first focus on the formulation of clear ETE goals, then attach the required outcome level and subsequently choose the appropriate evaluation methods. For example, if one intends to improve an airport’s capability to prevent secondary transmissions during a case of tuberculosis on a plane, then this goal is formulated on the system level. However, if the goals on the system level are not met, the formulation of goals and the evaluation of outcomes on individual behavior, knowledge, and skills are required to determine what and who should be further supported. Choosing the appropriate evaluation methods might involve requesting access to track records, including external observers, planned skill demonstrations, or validated knowledge tests. Lastly, we highly recommend sharing evaluations and lessons learned from ETEs on a broad scale to directly support co-organizers and provide policymakers with the chance to deploy costs, time, and capacity towards the optimal effect. By standardizing the evaluation of ETEs, comparisons with methods in general adult education would become possible and provide an even broader base for recommendations on effect. We call for international efforts to facilitate this sharing of evaluations and experience, for example, through maintaining a sustainable electronic training platform where standard information about the set-up, implementation and evaluation of ETEs can be registered and exchanged. A standard set of scenarios in cross-border settings or training materials could further encourage this development. Previous time-restricted projects have shown their potential [88], but a sustainable option has been missing.

We conclude that although extensive training and education programs exist in infectious disease control, recent literature can only partly and prudently prove their added value, especially in cross-border settings. We see promising results for online methodologies, which report results similar to those of offline training, although relationship building and networking are among the aspects most valued by participants of face-to-face training. Above all, future developers of ETEs should not forget the long-term perspective of their efforts: many fellow organizers benefit when evaluations are shared and reported in a detailed and thorough manner. This paper, therefore, presents a call for publishing ETE evaluations in order to facilitate overall system learning and the preparation of a workforce that can cope with the perpetual challenges of global infectious disease control.