Background

As a medium for deliberate reflective practice, debriefing is commonly cited as one of the most important aspects of learning in immersive simulation-based education (SBE) [1,2,3]. Defined as a ‘discussion between two or more individuals in which aspects of performance are explored and analysed’ ([4], p. 658), debriefing should occur in a psychologically safe environment in which learners can reflect on actions, assimilate new information with previously constructed knowledge, and develop strategies for future improvement within their real-world context [5, 6]. Debriefings are typically led by facilitators who guide conversations to ensure content relevance and achievement of intended learning outcomes (ILOs) [7]. The quality of debriefing is thought to be highly reliant on the skills and expertise of the facilitator [8,9,10,11], with some commentators arguing that the skill of the facilitator is the strongest independent predictor of successful learning [2]. Literature from non-healthcare industries tends to support this notion, suggesting that facilitators enhance reflexivity, concentration, and goal setting, whilst contributing to the creation and maintenance of psychological safety, leading to improved debriefing effectiveness [12, 13]. However, this interpretation is not universally held and has been increasingly challenged [14,15,16,17,18].

It is in this context that self-led debriefings (SLDs) have emerged in SBE. There is currently no consensus definition of SLDs within the literature, with the term encompassing a variety of heterogeneous practices, creating a confusing narrative for commentators to navigate as they report on debriefing practices. We have therefore defined ‘self-led debriefing’ as a debriefing conducted by the learners themselves without the immediate presence of a trained faculty member. Several reviews have investigated the overall effectiveness of debriefings, with a select few drawing comparisons between SLDs and facilitator-led debriefings (FLDs) as part of their analysis [2,3,4, 7, 10, 17, 19,20,21,22]. The consensus from these reviews is that there is limited evidence of the superiority of one approach over the other. However, only four of these reviews conducted a critical analysis of the presence of facilitators within debriefings [2, 19, 20, 22]. Moreover, in one review [19], a narrow search strategy identified only one study comparing SLDs with FLDs [14]. To our knowledge, only one published review has explored SLDs specifically, investigating whether the presence of a facilitator in individual learner debriefings, in-person or virtual, impacted on effectiveness [23]. Within these parameters, the review concluded that well-designed SLDs and FLDs produce equivalent outcomes; however, it did not explore the influence of in-person SLDs on debriefing outcomes for groups of learners in immersive SBE. The value and place of SLDs within this context, either in isolation or in comparison with FLDs, therefore warrants further investigation.

Within the context of immersive SBE, and in-person group debriefings specifically, we challenge the concept of ‘one objective reality’, instead advocating for the existence of multiple subjective realities constructed by individuals or groups. The experiences of learners influence both their individual and group perceptions of reality, and therefore different meanings may emerge from the same nominal simulated learning event (SLE) [24]. As such, this study has been undertaken through the lens of both constructionism and constructivism, with key elements deriving from both paradigms. Constructionism emphasises the profound impact that societal and cultural norms have in determining how subjective experiences shape an individual’s formation of meaning within the world, or context, that they inhabit [25, 26]. Constructivism is a paradigm whereby individuals socially construct concepts and schemas from their subjective experiences to cultivate personal meanings and develop a deeper understanding of the world [26, 27]. In the context of in-person group debriefings, the creation of such meaning, and therefore learning, may be shaped and influenced by the presence or absence of facilitators [24].

The discourse surrounding requirements for facilitators and their level of expertise in debriefings has important implications due to the resources required to support faculty development programmes [2, 8, 9, 28]. SLDs are a relatively new concept offering a potential alternative to well-established FLD practices. Evidence exploring the role of in-person SLDs for groups of learners in immersive SBE is emerging but is yet to be appropriately synthesised. The aim of this integrative review (IR) is to collate, synthesise and analyse the relevant literature to address a gap in the evidence base, thereby informing simulation-based educators of best practices. The research question is: with comparison to facilitator-led debriefings, how and why do in-person self-led debriefings influence debriefing outcomes for groups of learners in immersive simulation-based education?

Methods

The traditional perception of systematic reviews as the gold-standard review type has been increasingly challenged, especially within health professions educational research [29]. An IR systematically examines and integrates findings from studies with diverse methodologies, including quantitative, qualitative, and theoretical datasets, allowing for deep and comprehensive interrogation of complex phenomena [30]. This approach is particularly relevant in SBE, where researchers employ a plethora of study designs from differing theoretical perspectives and paradigms. An IR is therefore best suited to answering our research question and generating new insights, such that our understanding of SBE is not unnecessarily restricted [31].

This IR has been conducted according to Whittemore & Knafl’s framework [30] and involved the following five steps: (1) problem identification; (2) literature search; (3) data evaluation; (4) data analysis; and (5) presentation of findings. Whilst the key elements of this study’s methods are presented here, a detailed account of the review protocol has also been published [24]. The protocol highlights the rationale and justification of the chosen methodology, explores the underpinning philosophical paradigms, and critiques elements of the framework used [24].

Problem identification

We modified the PICOS (population, intervention/interest, comparison, outcome, study design) [32] framework to help formulate the research question for this study (Table 1), supplementing the ‘comparison’ arm with ‘context’ as described by Dhollande et al. [33]. This framework suited our study in which the research question is situated within the context of well-established FLD practices within SBE. Simultaneously, it recognises that studies with alternative or no comparative arms can also contribute valuable insights into how and why SLDs influence debriefing outcomes.

Table 1 PICOS framework [32, 33] used to construct research question

Literature search

Search strategy

Using an extensive and broad strategy to optimise both the sensitivity and precision of the search, we searched seven electronic bibliographic databases (PubMed, Cochrane, Embase, ERIC, SCOPUS, CINAHL Plus, PsycINFO) for records published up to and including October 2022. The search terms are presented below in a logic grid (Table 2). Using a comparator/context arm minimised the risk of missing studies describing SLDs in terms of what they are not, i.e. ‘without a facilitator’. A full delineation of the search strategy, including keywords and Boolean operators, for each electronic database is available (Additional file 1). Additionally, we conducted a manual search of reference lists from relevant studies and SBE internet resources. We enlisted the expertise of a librarian to ensure appropriate focus and rigour [34, 35].
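For illustration only, a simplified search string combining the population, interest and context arms of such a logic grid might take the following form, with terms within each arm joined by OR and the arms joined by AND. The terms shown here are indicative rather than the actual strategy; the full database-specific strategies, including all keywords, subject headings and Boolean operators, are provided in Additional file 1.

(simulat* OR “simulation-based education”) AND (debrief* OR “after-action review”) AND (“self-led” OR “self-debrief*” OR “peer-led” OR “without a facilitator”)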

Table 2 Logic grid aligned with the PICOS elements of the review question, omitting outcome/study design categories [33,34,35]

Inclusion and exclusion criteria

Articles were included in this review if their content met the following criteria: (1) in-person debriefings following immersive simulated learning events; (2) debriefings including more than one learner; (3) healthcare professionals or students as learners; (4) published peer-reviewed empirical research; (5) reported in English. Forms of grey literature, such as doctoral theses, conference or poster abstracts, opinion or commentary pieces, letters, websites, blogs, instruction manuals and policy documents, were excluded. Similarly, studies describing clinical event debriefings, individual debriefings, or non-immersive or virtual debriefings were also excluded. Date of publication was not an exclusion criterion.

Study selection

Following removal of duplicates using the bibliographic software package EndNote™ 20, we screened the titles and abstracts of retrieved studies for eligibility. Full texts of eligible studies were then examined, and application of the inclusion and exclusion criteria determined which studies were appropriate for inclusion in this IR. We used a modified version of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) reporting tool [36] to document this process.

Data evaluation

The process of assessing quality and risk of bias is complex in IRs due to the diversity of study designs, with each type of design generally necessitating differing criteria to demonstrate quality. In the context of this complexity, we used the Mixed Methods Appraisal Tool (MMAT) which details distinct criteria tailored across five study designs: qualitative, quantitative randomised-controlled trials (RCTs), quantitative non-RCTs, quantitative descriptive and mixed methods [37].

Data analysis

Data were analysed using a four-phase constant comparison method originally described for qualitative data analysis [38, 39]. Data are compared item by item so that similar data can be categorised and grouped together, before further comparison between different groups allows for an analytical synthesis of the varied data originating from diverse methodologies. The four phases comprise (1) data reduction; (2) data display; (3) data comparison; and (4) conclusion drawing and verification [30, 38, 39]. Following data reduction and extraction, we performed reflexive thematic analysis (RTA) according to Braun & Clarke’s [40] framework to identify patterns, themes and relationships that could help answer our research question and form new perspectives and understandings of this complex topic [41]. RTA is an approach underpinned by qualitative paradigms, in which researchers have a central and active role in the interpretative analysis of patterns of data and their meanings, and thus in subsequent knowledge formation [40]. RTA is particularly suited to IRs exploring how and why complex phenomena might exist and relate to one another, as it enables researchers to analyse diverse datasets reflexively. It can therefore facilitate the construction of unique insights and perspectives that may otherwise be missed through other forms of data analysis. A comprehensive justification, explanation and critique of this process can be found in the accompanying IR protocol [24].

Results

Study selection and quality assessment

The search identified a total of 1301 publications, of which 357 were duplicates. After screening titles and abstracts, 69 studies were selected for full-text review. Of these, 18 studies were included for data extraction and synthesis (Fig. 1). Reasons for study exclusion are listed in Additional file 2.
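For clarity, the screening arithmetic implied by these figures is as follows, with the intermediate exclusion counts derived by subtraction: 1301 records − 357 duplicates = 944 titles and abstracts screened; 944 − 875 excluded = 69 full texts assessed; 69 − 51 excluded = 18 studies included.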

Fig. 1
figure 1

Modified PRISMA flow diagram summarising the search and screening process [36]

All 18 studies were appraised using the MMAT (Table 3). Five questions, adjusted for differing study designs, were asked of each study and assessed as ‘yes’, ‘no’ or ‘can’t tell’. The methodological quality and risk of bias of individual studies influenced the analysis of their data and their subsequent weighting and contribution to the results of this review; the quality assessment process therefore shapes the interpretations that can be drawn from the collective dataset. Whilst the studies demonstrated varying quality, scoring ‘yes’ on between 40% and 100% of the five questions (e.g. two ‘yes’ answers equates to 40%), no studies were excluded from the review on the basis of the quality assessment. There were wide discrepancies in the quality of different components of the mixed methods studies. For example, Boet et al. [15] scored 0% for the qualitative component and 100% for the quantitative component of their mixed methods study; the quantitative results were therefore weighted more heavily than the qualitative component in the data analysis and its incorporation into the results of this review. Meanwhile, Boet et al.’s [16] qualitative study scored 100%, strengthening the influence and contribution of that study’s results within this IR.

Table 3 MMAT data evaluation of included empirical studies [37]

Study characteristics

Key characteristics of the included articles, comprising the study aim and design, sample characteristics, descriptions of SLE and SLD formats, data collection instruments, and key reported findings, are summarised in Table 4. The included studies comprised one qualitative study, eight quantitative RCTs, six quantitative non-RCTs, one quantitative descriptive study and two mixed methods studies. All 18 studies originated from socio-economically developed countries: six from South Korea [44,45,46, 51,52,53], five from the USA [42, 43, 48, 54, 55], and the remainder from Canada [15, 16], Australia [56, 57], Spain [49, 50], and Switzerland [47]. Two studies were multi-site [51, 52]. The immersive SLE activities were of varying formats, designs, and durations. Sixteen studies described team-based scenarios [15, 16, 42, 44,45,46,47,48,49,50,51,52, 54,55,56,57] whilst two used individual scenarios [43, 53], with learners subsequently debriefing in groups. Four studies incorporated simulated participants in the scenarios [42, 43, 45, 46]. All studies obtained ethical approval and were published after 2013.

Table 4 Overview and characteristics of included studies

Learner characteristics

In total, the 18 studies recruited 2459 learners. Of these, the majority were undergraduate students of varying professional backgrounds: 1814 nursing, 210 medical, 158 dental, 73 occupational therapy, and 39 physiotherapy students. Only 165 learners were postgraduate professionals: 129 doctors and 26 nurses. In all but four studies [15, 16, 49, 54], learners worked with their own professional group rather than as part of an interprofessional team.

Self-led debriefing format

The specific debriefing activities, whether SLDs, FLDs or a combination of both, took several different formats and lasted between 3 and 90 min. Most SLDs utilised a written framework or checklist to guide learners through the debriefing, although this was unclear in two studies [42, 44]. Two studies required learners to independently self-reflect, via a written task, prior to commencing group discussion [49, 50]. Some studies included video playback within their debriefings [15, 16, 42, 43, 49,50,51,52].

Data collection instruments and outcome measures

In total, 38 different data collection instruments were used across the 18 studies. These are listed in Table 4, along with their components and incorporated scales where described in sufficient detail within the primary study. The validity and reliability of these instruments are variable; indeed, 13 were developed by study authors without any data on validity or reliability. Authors used one or more instruments to measure outcomes in five key domains (Table 5).

Table 5 Outcome measures

Key reported findings of studies

There was significant heterogeneity across the designs, aims, samples, SLD formats, outcome measures and contexts of the 18 studies, with findings that were often conflicting and at inherent risk of bias owing to the study designs and outcome measures used. Nine studies reported equivalent outcomes on some elements of debriefing quality, participant performance or competence, self-confidence or self-assessed competence, and participant satisfaction [15, 45,46,47,48,49, 52, 53, 56]. However, of these nine, five also reported that SLDs were significantly less effective on other elements of their outcome measures [45, 46, 49, 52, 56]. In addition to these five, two further studies reported decreased effectiveness of SLDs in comparison with FLDs or a combined SLD + FLD approach [43, 50]. Conversely, only Lee et al. [52] and Oikawa et al. [48] reported any significant improvements on selected outcome measures with SLDs compared with FLDs, whilst Kündig et al. [47] reported improvements in two performance parameters with SLDs when compared with no debriefing.

Four studies investigated a combined SLD + FLD strategy and demonstrated either significantly improved or equivalent outcomes compared with SLDs or FLDs alone [49,50,51, 56]. Kang and Yu [51] reported significantly improved outcomes for problem-solving and debriefing satisfaction, but no differences in debriefing quality or team effectiveness. Other studies reported the opposite, with significantly improved team effectiveness and debriefing quality, but no improvements in problem-solving or debriefing experience [49, 50]. Tutticci et al. [56] reported both significant and non-significant improvements in reflective thinking, depending on which scoring tool was used. These findings, however, sit in the context of variable quality appraisal scores (Table 3), wide variation in SLD formats and data collection instruments, and improved outcomes regardless of the method of debriefing used.

Thematic analysis results

We undertook RTA of the dataset, revealing four themes and 11 subthemes (Fig. 2). The tabulation of themes and an exemplar of the coding strategy and theme development can be found in Additional files 3 and 4.

Fig. 2
figure 2

Thematic analysis map illustrating themes and subthemes

Theme 1: Promoting self-reflective practice

The analysis revealed that promoting self-reflective practice is the most fundamental component of how and why SLDs influence debriefing outcomes. Debriefings can encourage groups of learners to critically reflect on their shared simulated experiences, leading to enhanced cognitive, social, behavioural and technical learning [15, 16, 42, 43, 45,46,47,48, 50, 51, 53, 54, 56, 57]. Different components within SLDs, including structured frameworks, video playback, and debriefing content, may influence such self-reflective practice. Most authors advocated a printed framework or checklist to help guide learners through the SLD process; despite this, SLDs were found to be less structured than FLDs [16]. The Gather-Analyse-Summarise framework [59] was most commonly used [46, 49,50,51, 53]. One study compared two locally developed debriefing instruments, the Team Assessment Scales (TAS) and Quick-TAS (Q-TAS), concluding that the Q-TAS was more effective in enabling the analysis of actions but equivalent on all other measures [54].

Video playback offered a form of feedback for learners that encouraged reflective processing of scenarios [15, 16, 52]. One article concluded by quoting a learner: ‘I learned it’s worthwhile to revisit situations like this. I know I won’t always have video to critique, but being able to rethink through the appointment will be helpful to review which tactics helped and which ones need to be revised’ ([42], p. 929). In this way, video playback enables learners to perceive behaviours of which they were previously unaware [15]. Whilst many studies lacked interrogation of the content within SLDs, Boet et al. [16] provided an extensive analysis, reporting that interprofessional SLDs centred on content such as situational awareness, leadership, communication, roles, and responsibilities. Furthermore, learners’ perceived performance of this content offered entry points into reflection [16]. Some studies required learners to document their thoughts and impressions [44, 45, 47, 48, 50, 53]; however, the influence of content documentation on promoting self-reflective practice was inconclusive.

Combined SLD + FLD strategies involved learner and faculty co-debriefing [56], or SLDs preceding FLDs [49,50,51]. Using the Reflective Thinking Instrument, one study reported that FLD and combined SLD + FLD groups demonstrated significantly higher levels of reflective thinking amongst learners than SLD groups [56]. Within the limitations of a tool with poor validity and reliability, this study provides the best evidence that a combined approach to debriefing groups may be the most beneficial method for encouraging learner critical self-reflection. This finding is supported by results from three other studies showing improved outcomes with combined debriefing strategies, across team effectiveness [49], debriefing quality [50], problem-solving processes [51] and satisfaction with debriefing [50, 51].

Theme 2: Experience and background of learners

The experience and background of learners have a profound impact on how and why SLDs influence debriefing outcomes. Previous SBE experience may significantly affect learners’ ability to engage meaningfully with the SLD process and influences their expectations of how a simulated scenario will progress [15, 16]. Furthermore, previous experience with FLDs may positively contribute to rich reflective discussion within SLDs, as learners are better placed to integrate FLD goals and processes within a new context [16]. Whilst its influence on the conduct of SLDs is less clear, Boet et al. [16] note that real-world clinical experience allows learners to recontextualise their simulated experiences more readily and may therefore act as an entry point into the reflective process. In teams from the same professional background, learners appreciated the value of learning from constructive exchanges of opinion between colleagues operating at the same level [42, 44, 45] and from role-modelling teamwork behaviours [48], whilst interprofessional SLDs may help break down traditional working silos and support learning in contexts that replicate clinical practice [15]. Finally, learners originated mainly from either South Korea or North America. Cultural differences between Korean and Western learners may affect debriefing practices, with Korean students described as less expressive than their Western colleagues [46]. The impact of cultural diversity on SLD methods, however, was not specifically investigated [44, 46, 53].

Theme 3: Challenges of conducting SLDs

Several challenges of conducting SLDs were constructed from the dataset, relating to the closing of knowledge gaps, the reinforcement of erroneous information, and resource allocation. The absence of expert facilitators may represent a missed learning opportunity, whereby erroneous information could be discussed and consolidated, negatively affecting subsequent performance [44, 45, 47, 51] and potentially persisting into clinical practice [46]. There was a consistent student preference for FLDs over SLDs, which may indicate that learners seek expert reassurance and accurate debriefing content not readily available from peers [43, 50]. A significant motivating factor for investigating and employing SLDs is the potential to reduce costs by removing the requirement for expensive faculty presence [15, 16, 44,45,46, 49, 57]. However, SLDs do not appear to negate the need for faculty presence completely, but rather limit the faculty role to specific elements within an SLE [15, 16]. Furthermore, the most influential impact on debriefing outcomes may come from incorporating SLDs in combination with, rather than at the expense of, FLDs [49,50,51, 56]. Finally, most articles integrated an FLD element within their SLE, thereby negating positive impacts on resource allocation [15, 16, 42,43,44,45,46, 49,50,51, 54,55,56,57].

Theme 4: Facilitation and leadership

The facilitation and leadership of SLDs may have a considerable impact on how and why SLDs influence debriefing outcomes. Only five articles described how learners were allocated as leaders and facilitators of SLDs [43, 54,55,56,57]. Random allocation of learners to lead and facilitate SLDs occurred either prior to, or on the day of, the SLE [54,55,56]. In two studies, learners took turns leading the debriefing such that all learners facilitated at least one SLD [43, 57]. No articles discussed the influence of the leadership and facilitation role on learners, nor learners’ reactions, thoughts, or feelings towards the role, nor its effect on the content and reflective learning of subsequent debriefings. In two articles describing the same learner sample, only one of 17 interprofessional SLDs was nurse-led, with all others led by a medical professional [15, 16]. Such situations may have unintended implications by reinforcing stereotypes and hierarchical power imbalances.

Learners were trained to lead the SLDs in only two studies. In one, learners were randomly allocated to lead the SLDs and were directed to online resources, including videos, checklists, and relevant articles, to help them prepare for this role prior to the SLE [56]; no information concerning learners’ engagement with these resources was documented. In the other, learners were given 60 min of training on providing constructive feedback to peers, which did not lead to improved outcomes for debriefing quality, performance, or self-confidence [43].

Discussion

The aim of this IR was to collate, synthesise and analyse the relevant literature to explore, with comparison to FLDs, how and why in-person SLDs influence debriefing outcomes for groups of learners in immersive SBE. The review identified 18 empirical studies with significant heterogeneity with respect to designs, contexts, learner characteristics, and data collection instruments. It is important to recognise that the review’s findings are limited by the variety, and variability in quality, of the data collection instruments and debriefing outcome measures used in these studies, as well as by some of the study designs themselves. Nevertheless, the findings of this review suggest that, across a range of debriefing outcomes, SLDs can provide an alternative means of safeguarding effective learning in situations where resources for FLDs are limited. In some cultural and professional contexts, and for certain debriefing outcome measures, SLDs and FLDs may provide equivalent educational outcomes. Additionally, a small cohort of studies suggests that combined SLD + FLD strategies may be the optimal approach. Furthermore, SLDs influence debriefing outcomes most powerfully by promoting self-reflection amongst groups of learners.

Promoting self-reflection

Aligned with social constructivist theory [80], the social interaction of collaborative group learning in a reflective manner can lead to the construction, promotion and sharing of a wide range of interpersonal and team-based skills [81, 82]. Currently, there is a lack of evidence concerning which frameworks are best suited to maximising such reflection [10], especially in SLDs. Whilst framework use is associated with improvements in debriefing quality and subsequent performance, some evidence suggests that, in terms of promoting reflective practice, the specific framework itself is of less importance than the skills of the facilitator using it and the context in which it is applied [7, 9, 10]. In SLDs there is no facilitator to guide this process, and one may therefore infer that the framework itself has relatively more influence on debriefing outcomes and the reflective process of learners than it does in FLDs. Conversely, whichever framework was used, the quality of SLDs was rated highly, implying that it may be the structure provided by a framework, as opposed to its specific content, that is the critical factor in promoting reflection. Based on the findings of their qualitative study, in which self-reflexivity, connectedness and social context informed learning within debriefings, Gum et al. [83] developed a reflective conceptual framework rooted in transformative learning theory [84], which purported to enable learners to engage in critical discourse and learning. By placing learners at the centre of their model, and by focusing on the three themes previously mentioned, this framework seems suited to groups of learners in SLDs. However, like many other debriefing frameworks, it remains untested in SLD contexts. In a study of business students, Eddy et al. [85] describe using an online tool that anonymously captured and analysed individual team members’ perceptions of an experience. The tool then prioritised reported themes to create a customised guide for teams to use in a subsequent in-person group SLD. The study reported that using this tool resulted in superior team processes and greater subsequent team performance when compared with SLDs using a generic debriefing guide only. Such tools may have a place in promoting self-reflection in healthcare SBE, for example with postgraduate learners who have previous experience of debriefings or who have undertaken training in debriefing facilitation.

Furthermore, other structures or techniques that may help influence and promote self-reflection amongst groups of learners in SLDs are, as yet, untested in this context. For example, SLDs could take the form of in-person or online post-scenario reflective activities, in which learners work collaboratively on pre-determined tasks that align to ILOs. Examples such as escape room activities in SBE, in which learners work together to solve puzzles and complete tasks through gamified scenarios, have used concepts grounded in self-determination theory [86], with promising results in terms of improving self-reflection and learning outcomes [87, 88]. Meanwhile, individual virtual SLD interventions, rooted in Kolb’s experiential learning theory [89], have been tested and purport to enable critical reflection amongst users [90, 91]. Whilst such approaches may be relatively resource-intensive to create, they could be applied to SLDs for groups of learners in immersive SBE and prove resource-efficient once established.

Video playback

In both individual and group SLD exercises, video playback can allow learners to self-reflect, analyse performance, minimise hindsight bias, and identify mannerisms or interpersonal behaviours that may otherwise remain hidden [15, 42, 52, 92,93,94,95]. These findings are supported by situated learning theory, whereby learning can be associated with repeated cycles of visualisation of, and engagement with, social interactions and interpersonal relationships, enabling co-construction of knowledge amongst learners [96]. Conversely, in group SLD contexts, watching video playback may have unintended consequences for psychological safety, making learners feel self-conscious and anxious and impacting negatively on their ability to engage meaningfully with reflective learning [93]. A systematic review concluded that the benefits of video playback are highly dependent on the skill of the facilitator rather than on the video playback itself [95], and as such its role in influencing debriefing outcomes in SLDs remains uncertain.

Combining self-led and facilitator-led debriefings

The findings of this review suggest that employing combinations of SLDs and FLDs may optimise participant learning [49,50,51, 56], whilst acknowledging that this may also depend on other variables such as the expertise of debriefers and the contexts within which debriefings occur. Whilst the reported improved outcomes are situated in the context of in-person SLDs for groups of learners, they are supported by the wider literature. For example, a Canadian research group investigated combining in-person and virtual individual SLD formats with FLDs, reporting improved debriefing outcomes across multiple domains including knowledge gains, self-efficacy, reflection, and debriefing experience [90, 97,98,99]. The SLD components of the combined strategy enabled learners to reflect, build confidence, identify knowledge gaps, collect and organise their thoughts, and prepare for group interaction prior to an FLD [90, 97,98,99]. However, limitations of these studies include the unreliability of their outcome measures.

Facilitation and leadership

Only two studies provided training for learners in how to facilitate debriefings and provide constructive feedback [43, 56]. This is surprising given the emphasis on faculty development in the SBE literature [6, 9, 28, 100]. RTA of the data highlighted how previous experience with FLDs may influence learners’ ability to engage actively with the reflective nature of the SLD process [15, 16]. This raises the question of whether learners should have some familiarity with debriefing processes, whether via previous experience or targeted training, prior to facilitating group SLDs.

Variables such as learners’ debriefing experience and educational context also have implications for interpreting the findings of this review, and raise further questions about whether SLDs may be more suitable for certain populations, such as students in early training or postgraduates who are relatively more experienced in SBE. Training peers as facilitators, who then act in an ‘instructor’ role rather than as part of the learner group, has also been reported as an effective method of positively influencing debriefing outcomes [101, 102]. However, training learners to facilitate SLDs involves significant resource commitments, thus negating some of the initial reasons for instigating SLDs.

Data collection instruments and outcome measures

The studies included in this review used multiple data collection tools to gauge the influence of SLDs on debriefing outcomes across five domains (Table 5). This diversity in approaches to outcome measurement is problematic, as it impedes the ability to compare studies fairly, effectively, and robustly [103]. Certain instruments, such as the Debriefing Assessment for Simulation in Healthcare (Student Version) [68] and the Debriefing Experience Scale [69], are validated and reliable tools for assessing learner perceptions of, and feelings towards, debriefing quality in certain contexts. However, learner perceptions of debriefing quality do not necessarily translate into objective evaluation of debriefing practices. Additionally, some studies relied on learner self-confidence and self-reported assessment questionnaires for their outcome measures, despite self-perceived competence and confidence being poor surrogate markers for clinical competence [104]. Commonly used tools measuring debriefing quality may not be suitable for SLDs, and a ‘one-size-fits-all’ approach could invalidate results [105]. To our knowledge, no validated or reliable tool is currently available that specifically assesses the debriefing quality of SLDs.

Psychological safety

One important challenge of conducting SLDs, which was not constructed through the RTA of this dataset, is ensuring the psychological safety of learners in debriefings. Psychological safety is defined as ‘a shared belief held by members of a team that the team is safe for interpersonal risk taking’ ([106], p. 350), and its establishment, maintenance and restoration in debriefings are of paramount importance for learners participating in SBE [107, 108]. Oikawa et al. [48] stated that ‘self-debriefing may augment reflection through the establishment of an inherently safe environment’ ([48], p. 130), although how safe environments are ‘inherent’ within SLDs is unclear. Tutticci et al. [56] quote secondary sources [83, 109] stating that peer groups can improve collegial relationships and engender safe learning environments that improve empathy whilst reducing the risk of judgement. Conversely, it may also transpire that psychologically unsafe environments are fostered, leading to unintended harmful practices. In interprofessional contexts, where historical power imbalances, hierarchies and professional divisions can exist [11, 110, 111], and in which facilitator skill has been the most frequently cited enabler of psychological safety [112], one can infer that threats to psychological safety may be accentuated in SLDs.

In contrast, researchers found that the process of engaging in an individual SLD enhanced psychological safety by decreasing learners’ stress and anxiety, leading to more active engagement and meaningful dialogue in subsequent FLDs [99]. Another study reported that the familiarity of connecting with known peers within SLDs fostered psychological safety and enabled learning [98]. However, these studies were excluded from this review because they involved individual rather than group SLDs. Nevertheless, their findings that combined SLD + FLD strategies enable psychological safety may partially explain the findings of this review, and psychological safety may therefore be a central concept in understanding how and why SLDs influence debriefing outcomes.

For teams regularly working together in clinical contexts, their antecedent psychological safety has a major influence on any SLEs they undertake [113]. This subsequently impacts on how team members, both individually and collectively, experience psychological safety within their real clinical environment [113]. The place of SLDs in such contexts, along with their potential advantages and risks, remains undetermined.

Limitations

This review specifically investigated in-person group debriefings; the results may therefore not be applicable to individual or virtual SLD contexts. The inclusion criteria allowed only published, peer-reviewed empirical research reported in English, excluding grey literature. This may introduce bias, with some evidence suggesting that excluding grey literature can lead to exaggerated conclusions [114, 115], as well as concerns regarding publication bias [116]. We also acknowledge that the choices made in constructing and implementing our search strategy (Additional file 1) may have affected the total number of articles identified for inclusion in this review. Finally, the heterogeneity of the included studies limits the certainty with which generalisable conclusions can be drawn. Conversely, this heterogeneity enables a diverse body of evidence to be analysed, better informing the need for future research and where gaps may lie.

Recommendations for future research

The findings of this review have highlighted several areas requiring further research. Firstly, the role of combining group SLDs with FLDs should be explored, both quantitatively and qualitatively, to clarify its place within immersive SBE. Secondly, to inform best practice, different methods, structures and frameworks for group SLDs need investigating to assess what may work, for whom, and in which context. This extends to research investigating different groups, such as interprofessional learners, to ascertain whether certain contexts are more suitable for SLDs than others. Such work may feed into the production of guidelines to help standardise SLD practices across these differing contexts. Thirdly, assessment and testing of data collection instruments are required, as current tools are not fit for purpose; clarification of what is suitable and measurable in terms of debriefing quality and learning outcomes, especially in relation to group SLDs, is needed. Finally, whilst research into fostering psychological safety in FLDs is emerging, the same is not true in the context of SLDs, and this needs to be explored to ensure that SLDs are not psychologically harmful for learners.

Conclusions

To our knowledge, this is the first review to explore how and why in-person SLDs influence debriefing outcomes for groups of learners in immersive SBE. The findings address an important gap in the literature and have significant implications for simulation-based educators involved with group debriefings across a variety of contexts. The synthesised findings suggest that, across a range of debriefing outcome measures, in-person SLDs for groups of learners following immersive SBE are preferable to conducting no debriefing at all. In certain cultural and professional contexts, such as with postgraduate learners or those with previous debriefing experience, SLDs can support effective learning and may provide educational outcomes equivalent to FLDs or combined SLD + FLD strategies. Furthermore, there is some evidence to suggest that combined SLD + FLD approaches may optimise participant learning, and this approach warrants further research.

Under certain conditions and circumstances, SLDs can enable learners to achieve suitable levels of critical self-reflection and learning. As with FLDs, promoting self-reflective practice within groups of learners is fundamental to how and why SLDs influence debriefing outcomes, because it is through this metacognitive skill that effective learning and behavioural consolidation or change can occur. However, more work is required to ascertain for whom, and in which contexts, SLDs may be most appropriate. In situations where resources for FLDs are limited, SLDs may provide an alternative opportunity to enable effective learning; however, their true value within the scope of immersive SBE may lie as an adjunctive method alongside FLDs.