Background

Investment in leadership development in healthcare is substantial. Reports indicate an estimated annual spend on leadership development in the USA at $50 billion (USD) [1, 2]. In 2019, the National Health Service (NHS) in England invested £2 million to help boost leadership development [3] and made individual leadership and capability development fundamental to delivery of the NHS Long-Term Plan [4]. Over the last three decades, we have seen a growing trend in health systems around the world placing great emphasis and resource on improving ‘clinical leadership’ [5]. Traditionally, clinical leadership encompassed leadership delivered by doctors and nurses [6]. More recently, clinical leadership has expanded to comprise anyone trained to deliver frontline care [6].

The surgical profession has increasingly recognised the need for high-quality leadership both inside and outside the operating theatre [7, 8]. For example, the Royal College of Surgeons of England recommend that consultant surgeons have the responsibility to develop an effective team through leadership and teambuilding [7]. However, the necessity for surgeons and surgical teams to lead, inspire, and manage a team to meet the needs of patients is not a substantial or evidence-based component of the surgical curriculum.

In the UK, the Department of Health and Social Care (DHSC) recently called for more inclusive leadership in the healthcare professions, which aims to adopt a collaborative approach to leadership practice [9]. However, evidence for endorsing this approach was lacking. It was not clear how inclusive collaborative leadership could or should be achieved in practice. Within healthcare literature, the focus tends towards leadership development which advances the leadership skills of individuals, not collective teams [10]. This skills-based approach, promoted by Mumford and colleagues (2007), categorises leadership into types of skills, for example, cognitive, interpersonal, business, and strategic skills [10]. NHS England, alternatively, describe leadership development as confidence building, understanding practical levers, widening perspectives, and talent management [6]. Whilst these are more generic terms, they are still person-centric. These individualist approaches target the ability and motivation of surgeons to improve their own leadership. Little attention is given to the multidisciplinary teams, organisations, and environmental contexts in which leadership plays out. Skills-based leadership, therefore, fails to account for the contextual opportunities which enable leadership to be enacted in the inherently social conditions of the healthcare sector.

In contrast, Ability, Motivation, and Opportunity (AMO) theory describes how the interplay between the ability, motivation, and opportunity (of a person, a team, or a department) to enact leadership gives us a measure of leadership performance and performance-related outcomes [11]. This enables us to embrace leadership as a distributed and collective process [12]. In the distributed leadership literature, leadership becomes a shared process across a collective group, where people have common organisational perspectives, goals, and shared actions [12]. Framing leadership in this way allows us to move away from traditional heroic leadership tropes and to recognise the contribution that groups of people, such as a surgical team, make to leadership processes and practices [13].

Despite the significant investment in healthcare leadership development, and the numerous systematic reviews which have been conducted to determine the effectiveness of specific leadership interventions (such as team-training and co-leadership) [14,15,16,17,18,19], there is no agreement on what surgical leadership is, what leadership capabilities are, or how we can ensure they are developed and implemented effectively. Most important for the healthcare sector is that leadership development is viewed as a workforce intervention and, if funded by public money, should be underpinned by a rigorous evidence base. Therefore, it is vital that workforce interventions aimed at improving leadership processes are appropriate and able to achieve effective outcomes, not only to justify the significant expenditure but also to ensure that advanced leadership can improve the quality of patient care. This is a challenge in many areas of healthcare delivery, including surgery [20].

We aimed to fill this gap by conducting a realist review of interventions and strategies to promote evidence-based leadership in healthcare. The goal of our review is to develop a programme theory to answer the following question: In which contexts and for whom can interventions and strategies improve the leadership of surgical trainees, surgeons, and surgical teams, and why?

Methods

Realist review approach

Realist reviews are a theory-based approach to synthesising existing evidence. This review follows the Realist and Meta-narrative Evidence Syntheses: Evolving Standards (RAMESES) quality and reporting standards [21], which involve a process of focusing the review, developing programme theory, developing a search strategy, selecting and appraising documents, and applying realist principles to the analysis of data. We have provided a flow diagram which details the review process and how the RAMESES standards were followed (see Additional file 1) [21].

According to realist philosophy, interventions that are context-dependent, and those that are successful in certain contexts but not in others, can be described as complex. Leadership development is an inherently complex intervention [22]; realist review methods enable us to unpack the ‘black box’ of leadership. Realist reviews aim to develop a programme theory, grounded in existing literature, that seeks to explain why and how complex interventions work. They identify the underlying mechanisms (the hidden actions) that are triggered in certain contexts and lead to specific outcomes [23]. These causal chains are referred to as Context-Mechanism-Outcome configurations (CMOCs). In our review, we have adopted definitions of context, mechanisms, and outcomes as used by Wong and colleagues [21]. They describe context as the “backdrop of programs and research” and the condition that “triggers and/or modifies the behavior of a mechanism” [24]. A mechanism, then, is the agent of change: the “underlying entities, processes, or structures which operate in particular contexts to generate outcomes of interest” [25]. Because mechanisms are underlying, they can be difficult to identify in the literature. Finally, the outcome is the entity that changes as a consequence of the context triggering the mechanism.

Focusing the review

The review scope was developed iteratively through a scoping search of the literature, multiple expert stakeholder consultations, and discussion within the research team. Experts included orthopaedic surgeons at various career stages and surgical trainees, an orthopaedic Training Programme Director, members of an orthopaedic leadership Action Learning Set, and academics with expertise in clinical leadership. We developed the study protocol, which was registered with PROSPERO (CRD42021230709) and published [26]. To inform our initial programme theory, we searched for theoretical papers on surgical leadership. Our information specialist (RC) designed and conducted a systematic search in five databases, not limited by date.

The theory search identified 8382 articles. Two authors independently screened the titles and abstracts identified in one database search (MEDLINE n = 4012). Included papers were obtained at full text and read for relevance to the review. Potentially relevant articles were summarised and discussed with the wider project team. During this process it became apparent that the theoretical papers were unhelpful in progressing our initial programme theory. Many articles appeared generic, describing the importance of ‘good’ leadership at different levels (i.e. macro, meso, micro) of healthcare, but not specifically how leadership should be conceptualised, or which component parts could form the basis of future interventional research. Following discussion with experts, we therefore ceased this theoretical scoping activity and focused our resources on the identification of empirical studies, which we considered more informative for narrowing the scope of our initial programme theory.

Developing programme theory

Following scoping, our initial programme theory included leadership development outcomes for individual surgeon leadership, patient outcomes, and organisational outcomes [26]. After discussions with expert stakeholders and the research team, we narrowed the scope to focus only on outcomes for surgeons, surgical teams, and trainees, that is, those professional groups or communities of individuals who deliver surgical services. Our initial programme theory depicted organisational and patient outcomes as distal outcomes. These were the outcomes which may (or may not) develop as a result of improvements to the professional group, i.e. the proximal outcome of the leadership intervention (see Additional file 1). However, this large scope generated an intractable volume of literature and diluted the causal links between the intervention and intended outcome. A narrowed scope on proximal outcomes enabled us to meaningfully categorise the interventions and strategies used to promote evidence-based leadership in healthcare.

Developing a search strategy

A systematic search strategy for empirical studies was developed in MEDLINE (Ovid) by our information specialist (RC). The search was conducted in July 2021 and adapted to a variety of bibliographic databases relevant to the scope of the review: MEDLINE (Ovid), Embase (Ovid), PsycINFO (Ovid), Cochrane Library (Wiley), HMIC (Ovid), and ABI/INFORM Global (ProQuest).

A range of relevant search terms were included, combining the concepts of leadership interventions with surgeon/surgical team. The empirical search was limited to literature published in English after 2014, the year in which the ‘Surgical Leadership: A guide to best practice’ guidance was first published by the Royal College of Surgeons of England (an updated 2018 version has since been published) [8]. An example of the search strategy in MEDLINE (Ovid) is provided (see Additional file 1). The references of all included documents and relevant reviews (systematic and narrative) were dual screened to identify further relevant documents for consideration.

Selection and appraisal of documents

Studies were screened at title and abstract stage against the inclusion criteria listed in Table 1.

Table 1 Inclusion criteria

Two reviewer pairs independently screened all titles and abstracts identified through the empirical search (AW + MS, AG + MH). Full text articles were obtained and screened independently by two reviewers (MH + AW). Full-text articles were screened against the inclusion criteria and with consideration to their relevance [23, 27].

Disagreements were resolved through discussion including a third reviewer (AG or JG). Two reviewers (AW + MH) independently screened the reference lists of identified reviews and all included studies to identify relevant articles.

Applying realist principles to the analysis

Data extraction

A data extraction template was developed (JG) and piloted by the research team members (AG, AW, MH), and minor adaptations were made. Data were extracted by one research team member (MH or AW) and checked by another research team member (JG). We abstracted all data from the study that might be relevant to the research question into one document per study for review by the team [28]. Extracted data could include descriptive information such as geography and participants, but also data which could inform CMOCs or fragments of CMOCs. Data extracted are listed in Table 2.

Table 2 Data extracted from studies

Appraisal of the evidence

Relevance for purpose was the most important factor in determining inclusion in our review, and articles were not excluded based on their quality [27]. Nevertheless, since understanding of rigour is relevant for our synthesis and for understanding the strength of our findings [21], we used the mixed methods appraisal tool (MMAT) to review the quality of included studies [29]. All included articles were assessed by one reviewer (AW or MH) and 25% of studies were checked by a second reviewer (JG). Disagreements were resolved by a third reviewer (JG). We selected the MMAT as it can be used for all study designs [29]. We grouped articles into low, medium, or high quality [30, 31].

Data synthesis

All studies relevant to individual surgeon leadership were grouped into the four skill categories suggested by Mumford et al. [10]. For example, interventions which included one-to-one mentorship or coaching were grouped as interpersonal skills.

Discussion between the research team and expert stakeholders generated additional categories of leadership intervention where the focus was broader, for example team-based simulations. Next, the first author (JG) identified CMOCs and CMO fragments (e.g., Context and Outcome; Context and Mechanism), which were copied into a separate Word document for each type of leadership intervention (mentoring, coaching, simulation training, leadership course, feedback intervention, and debriefing) for discussion with the wider team. An excerpt from each article was selected as supporting evidence of the CMOC or CMO fragment. We then reviewed, compared, and contrasted all CMOCs and CMO fragments and tried to identify patterns and causal relationships between them.

We synthesised CMOCs and CMO fragments iteratively through verbal discussion, reading and commenting on each other’s configurations. We documented the CMOCs and CMO fragments in Word documents and in a programme theory diagram. We refined the diagram through discussion with the research team members. In several CMOCs, a mechanism or mechanisms were not immediately obvious, and not directly referred to in the literature. In these cases, the review team would suggest mechanisms that offered a potential ‘fit’ with the data [28]. We took a pragmatic approach and consulted expert stakeholders regularly to review and refine our CMOCs and programme theory and fill any gaps in our theory [28]. Where parts of a CMOC or CMO fragment were not grounded in the literature or expert stakeholders’ experience, this is clearly highlighted in our findings to enhance transparency.

Stakeholder consultations

A group of expert stakeholders was identified through the established network of the research team members and included two senior academics, four consultant surgeons and one surgical trainee. Stakeholders’ feedback on the consistency and plausibility of CMOCs and the programme theory was discussed in virtual meetings and incorporated into the final programme theory.

Results

Thirty-three articles were included in our review [32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64] (see Fig. 1). An overview of the characteristics of included studies is provided in Table 3. Studies included in our realist review were judged on their rigour, i.e. whether we considered the reported method used to generate the piece of data credible and/or trustworthy [21]. The overall quality of included studies varied: 21 studies were rated as high, eight studies as medium and four as low quality (see Table 3).

Fig. 1 Overview of search results

Table 3 Characteristics of included studies

CMOCs and programme theory

Individual leadership skills influenced by interventions were grouped into four categories (see Table 4). Table 5 provides an overview of the 19 CMOCs we identified through our analysis of all the included articles. Our final programme theory, which encompasses all CMOCs, is provided in Fig. 2. Across the 19 CMOCs, the outcome is the same: improved leadership by an individual as defined in the study (see Table 4 for definitions by study); however, context and mechanisms differ. The CMOCs have been grouped into three core areas which improve leadership: (1) feedback and how it is delivered to those partaking in leadership development, (2) the characteristics of the person or people undergoing leadership development, and (3) atmosphere, which represents the physical and psychological environment in which leadership development takes place. We now describe each of our 19 CMOCs in turn, with examples from the evidence provided.

Fig. 2 Final programme theory. The Foundational Model of Surgical Leadership Improvement

Table 4 Outcomes of leadership interventions grouped into cognitive skills, interpersonal skills, business skills, and strategic skills
Table 5 Context-mechanism-outcome configurations with example quotes

CMO1-2: timeliness of feedback

The timeliness of feedback was found to be an important contextual feature, which leads to the improvement of leadership. For example, Somasundram et al. (2018) showed that immediate critique from consultants after scenario simulations (referred to as ‘freeze-frames’) was effective for improving participants’ leadership learning [60]. This was echoed by Vu et al., who found that delayed feedback was perceived as limited in its usefulness to residents changing their leadership behaviour [63]. Stakeholders suggested that timely feedback is required because it makes feedback feel relevant and focused, as it is fresh in the memory of the learner. Additionally, a study on debriefing suggested that if feedback on identified problems was provided in a timely manner, surgeons’ faith in the interventions increased, as participants felt satisfied seeing their problems recognised and prioritised [34]. This additional mechanism resonated highly with our stakeholders.

CMO3: reoccurrence of feedback

Evidence suggests that for feedback to improve leadership, it needs to be provided more than once [42, 43, 59, 63]. For example, studies on feedback interventions identified that follow-up feedback should be provided in the form of a survey within 3–6 months [42]. In support of this, Vu et al. (2020) stated that only frequent feedback can lead to behaviour change [63]. A study on mentoring found that those who had weekly or monthly mentor meetings were most satisfied with their mentoring arrangements [43]. Since mentors provide feedback, we felt that this supported the other studies. According to Gregory et al., repeated feedback reinforces leadership improvement over time [42]. After deliberation with stakeholders, “reinforcement of learning” was agreed as the mechanism.

CMO4: feedback delivery by a trusted, respected person

Several studies suggest that feedback delivery through a trusted, respected person is an important context for improving the leadership of surgeons [42, 57, 63]. Studies mentioned the need for an objective person to deliver feedback [57], or a trained mentor or coach as a suitable and preferred person to provide feedback to surgeons [42, 63]. However, when discussing this with our stakeholders, they concluded that the most important aspect of a person delivering feedback is that you trust and respect them, as this reciprocal relationship makes you want to improve and maintain that person’s trust and respect. We then noted that excerpts revealed that coaches need to be “more experienced or reputable” (Pradarelli, 2016), indicating that trust and respect are key when delivering feedback.

CMO5: feedback from a range of trusted and respected people

We found that obtaining feedback from a range of people, for example from junior residents, advanced practitioners, and nurses [63], is crucial in improving leadership and important in the views of surgeons [42, 63]. Gregory et al. state that feedback from a range of people ensures both comprehensive and diverse feedback [42]. When exploring studies which investigated how mentoring can advance leadership, we found that having multiple mentors appears important in improving leadership for surgeons [36, 49]. However, we found no explicit explanation in the mentorship literature as to why having multiple mentors is effective. Stakeholders agreed that obtaining feedback from a range of people can be helpful as it provides a broader picture of yourself. However, stakeholders emphasised that feedback is only helpful in improving leadership if it comes from trusted and respected people. We adapted the CMO to reflect stakeholders’ considerations.

CMO6: delivery of anonymous feedback from juniors to seniors

Gregory et al. [42] found that anonymous feedback helped improve leadership as it allowed surgeons to focus more on feedback content, rather than its source. Stakeholders felt that anonymous feedback had a role to play but only in the context of juniors providing feedback to seniors. Stakeholders felt that this would allow juniors to provide honest feedback as they feel safe to speak their mind. We have specified the context and mechanisms accordingly.

CMO7: delivery of direct feedback from a peer or someone more senior

As outlined in the previous CMO, stakeholders felt that feedback was most effective in improving leadership if it was provided directly (rather than anonymously) by a peer or someone more senior. They reasoned that if feedback is provided by someone at your level or above (e.g. consultant to consultant), you would want to maintain their trust and working relationship and would therefore improve your leadership skills.

CMO8-9: openness to self-improvement

Our stakeholders suggested that openness to self-improvement is an important mechanism in several contexts. In line with this, two studies indicated that negative feedback is delivered best in a private context [37, 53]. Essentially, surgeons and trainees feel that it is important that they are not challenged or humiliated in front of their peers or colleagues. The private context seems to increase an openness to self-improvement via leadership. In contrast, those who were criticised in front of peers, for example for their handling of surgical cases at a trauma meeting, were less willing to take on similar cases again in the future or speak openly in meetings because of how it made them feel. In a study on mentoring, we discovered peer-to-peer approaches were more likely to positively impact leadership development [41]. We found that peer-to-peer communication tended to use non-hierarchical language, which may have facilitated positive reciprocal reactions to leadership development and a sense of openness between participants. Our stakeholders felt that the same mechanism (feeling open to and recognising the importance of self-improvement) may be at work here.

CMO10: awareness for the need for leadership skills

Timing within a professional career was an important context for improving leadership. Jaffe et al. [47] describe that leadership training was more effective at points of transition, where it was becoming necessary for surgeons to take on leadership roles to continue to progress, for example when surgeons were moving into a consultant surgeon or surgical director role. Stakeholders confirmed that those who are aware that they need leadership capabilities may feel and be more motivated to improve, as the intervention is perceived as more relevant for them. This was particularly pertinent when surgeons felt that they lagged behind their peers in this regard.

CMO11: having confidence in technical skills

Evidence suggests that leadership interventions are more effective in improving leadership of those with more confidence in their technical skills [47, 54]. Surgeons with more confidence in their technical surgical skills were able to focus on their leadership abilities in simulation training. Those with less confidence in their technical skills were facing the dual challenge of focusing on both technical surgical skills and leadership skill development [54]. The timing of leadership development appears relevant to effectiveness, with those surgeons with more confidence in their technical skills perhaps being more likely to benefit from leadership interventions. Stakeholders agreed with this CMO.

CMO12: having identified leadership deficits

We found evidence to suggest that those who were identified as having leadership deficits, via feedback interventions, demonstrated more improvement in leadership compared to their competent colleagues [42, 45]. Stakeholders suggested that those with identified deficits have more room for improvement and may be more motivated to improve; again, elements of peer comparison were mentioned as important.

CMO13: a variety of interactive learning activities

Studies evaluating leadership courses found that variation in the component leadership learning activities was important for improving leadership [44, 47, 57, 62]. Vitous et al. indicate that broad learning activities, such as team building, business acumen, and self-awareness, expand surgeons’ perspectives. Learning activities included business school principles, leadership in the healthcare context, self-empowerment, and economic forces such as understanding financial statements [47, 57, 62]. We found that active reading, reflection and discussion appeared to be learning activities which improved leadership development specifically [44]. Stakeholders agreed that a variety of learning activities are important but stressed that they needed to be interactive to engage participants.

CMO14: implementation of speak-up culture

Brindle (2018) found that if all members of a surgical team were allowed to speak up, about errors for example, this led to an improvement in communication and an improved sense of collective leadership [34]. In support of this, Jayasuriya-Illesinghe et al. (2016) found that if junior surgeons and nurses are not encouraged to speak up, this leads to communication breakdown between surgical teams [48]. Stakeholders agreed that a speak-up culture was a highly important context, whether that be speaking up about unacceptable behaviour or technical errors. Stakeholders felt that the mechanism at work was “feeling equally valued and a sense of engagement”. This highlights the importance of considering the organisational culture in which leadership development takes place.

CMO15: customisation to surgeons’ needs

Evidence from the literature suggests that leadership interventions are effective in improving surgeons’ leadership skills if they are customised to individual surgeons’ needs [53, 57, 59]. For example, studies showed that mentoring and coaching were more effective where surgeons were able to self-select their mentor [53, 59]. A study of a leadership development programme found that intervention effectiveness was dependent on whether the content was personalised to participants and considered their feedback [57]. Mutabdzic et al. indicated that a ‘sense of control’ over leadership development was the reason why customisation was deemed important in the design of leadership interventions [53]. However, our stakeholders felt that it was less about a ‘sense of control’ and more about the sense of relevance created when interventions are customised and surgeons or surgical trainees have a say in choosing what they feel is most important to them. We therefore adapted this CMO mechanism.

CMO16-17: safe learning environment

Evidence suggests that interventions improve leadership if they occur in a more intimate learning environment, meaning interventions delivered in person and in small groups or one-to-one. For example, mentoring studies showed that surgeons preferred one-to-one and face-to-face meetings, rather than larger group sessions [39, 59]. Similarly, studies of leadership courses and simulation training indicated that small group learning was preferred by participants [44, 60]. According to Hill et al., intimate learning environments increase participants’ willingness to share personal examples, which may encourage and reinforce their learning as they are actively engaging in the subject matter [44]. Our stakeholders reflected that these environments create a sense of ‘safe space’ where surgeons can speak openly to colleagues. Stakeholders also felt it allows participants to apply the learning to their personal context. We felt that both mechanisms were plausible and recognised both.

CMO18: training in surgical teams

Stakeholders stated that it would be important for surgical teams to be given time to train in leadership together and to run through operations together to reinforce learning in practice (the opportunity required in AMO theory). Our stakeholders stressed that leadership is a process and only through training together could you build trust, rapport, friendship, and mutual respect leading to surgical team leadership.

CMO19: genuine investment in the intervention

The concept of ‘genuine investment’ was important for leadership development. We found that if surgeons deem an intervention important in context, and delivered for ‘the right reasons’, then it was more likely to be successful in impacting leadership [34, 36]. For example, mentors who were perceived to be unselfish, and who did not derive any tangible benefit from offering mentoring, appeared to positively impact mentees’ leadership development [36]. We found that when executive staff were present in the operating room, in a supportive capacity, this signalled to the surgical team members a genuine investment in the leadership intervention [34]. Ramjeeawon et al. (2020) affirmed that genuine investment in terms of provision of a realistic setting in simulation studies led to improvement in leadership [58]. This rang true with our stakeholders, who concluded that the genuine investment triggered a sense of faith and engagement in the intervention and in the people delivering it. This led to increased commitment to the programme and, ultimately, improvement in leadership.

Discussion

Realist review methods were used to review the literature describing interventions and strategies which aim to promote evidence-based leadership in healthcare. We aimed to generate a programme theory to explain in which context and for whom surgical leadership interventions work and why. Thirty-three studies and seven stakeholders contributed towards the development of our programme theory which consists of 19 CMOCs. Our findings suggest that surgical leadership interventions improve leadership when feedback is delivered in a timely manner, on multiple occasions and by a range of trusted and respected people. With regard to negative or more developmental feedback, we identified that it is best provided privately. Feedback from seniors to juniors or peers should be delivered directly, whereas feedback from juniors to seniors should be provided anonymously.

While numerous systematic reviews have described the effectiveness of individual leadership interventions in healthcare settings [14,15,16,17,18, 65], we are only aware of one realist review which aimed to understand and explore context and mechanisms; however, this included all medical and surgical specialties [15]. De Brún and McAuliffe’s review found that training in teams is important for leadership intervention success, suggesting that [it engenders] “the development of a shared understanding and appreciation of skills of others” [15]. We further conceptualise this mechanism more narrowly as trust, rapport, friendship, and mutual respect, which we also found enables leadership to flourish. De Brún and McAuliffe described “open and inclusive communication” as important [15]. We identified this as a speak-up culture. However, our analysis and consultation with stakeholders suggest that this speak-up culture triggers “feeling equally valued and a sense of engagement”, which creates an opportunity for effective leadership to develop [15].

As we anticipated, most of the leadership interventions identified in our review targeted the leadership development of individual surgeons. Many studies in our review report surgeons learning leadership skills through standalone interventions, for example in the studies by Pradarelli et al. and Ramjeeawon et al., but there was very little evidence to describe how and whether this skill-based approach to learning extends into clinical practice [57, 58]. Training surgeons in leadership will only get us so far in making improvements. It is comparable to learning surgery using a textbook, but not letting surgical trainees into the operating theatre to practise and hone their skills in real environments, interacting with team members, seniors, and patients.

This individualised focus reflects the proliferation of clinical leadership programmes and courses targeted at medical professionals across the globe [3,4,5,6]. For example, many papers describe leadership development via nontechnical skills training. These high-profile 1-day courses aim to optimise and enhance the performance of individual surgeons [29], yet the evidence for their effectiveness is limited to attendees’ or peers’ self-reports of changes in leadership skills [44, 57]. For example, authors asked surgeons ‘do you believe you are a better leader’ and surgeons often replied positively. This provides little sense of the surgeons’ actual capacity to enact effective leadership. Where leadership improvement was measured objectively, evidence of improvement was captured using tools such as 360° feedback reports, or surgery-specific scales including the Nontechnical Skills for Surgeons (NOTSS) and Oxford Nontechnical Skills (NOTECHS) assessment or the Team Strategies and Tools to Enhance Performance and Patient Safety (TeamSTEPPS) scoring. Most studies included in our review did not have a longitudinal design or include multiple sources of evaluation (for example, a multi-method case study). Hence, it is not clear what impact leadership interventions have or for how long any impact is sustained, let alone whether it can be scaled.

Some might argue that more rigorous approaches to the measurement and evaluation of leadership interventions are required: firstly, to overcome the apparent responder bias in the literature and secondly, to move a step closer to determining whether the large investment the healthcare sector makes in developing leaders is delivering a return. Yet, we recognise that studies which have sought to establish links between leadership and performance have long been criticised as circumstantial or anecdotal [66]. Indeed, the causal link has been characterised as an 'act of faith' rather than an empirically proven fact [67]. In part, this is due to a poor conception of what leadership is, and what leadership is not, in public services such as healthcare [67, 68]. There are also methodological problems, such as confounding variables (for example, increased funding for the NHS during the Blair Labour Government resulted in an improved performance effect in the NHS not necessarily attributable to the influence of leadership), and a time lag between leadership actions and their effect that makes the effect difficult to discern [69]. Hence, we might best consider ‘effective’ leadership as that aligning with ideal type skill-based models as set out in literature, such as a transformational variant, but one that encompasses an individualistic and distributed configuration of leadership influence, rather than one that focuses upon a ‘heroic’ individual [70].

Nevertheless, the interventions and strategies shown to be most effective in our review include those which aim to raise awareness of the importance of leadership for surgical practice, those that attract people with established confidence in their technical surgical skills, and interventions aimed at surgeons who have been labelled with leadership deficits. Our findings suggest that perceived openness to leadership development, whether that be because surgeons have identified deficits or because their technical surgical skills are becoming more innate, was motivational for leadership development. Therefore, leadership interventions can build abilities and capabilities and give surgeons time to focus on their leadership, and an opportunity to practise leadership in the context of clinical practice.

The timing of leadership interventions in surgical careers seems to be a key aspect of their effectiveness. Traditionally, the literature on the timing for the implementation of innovations and behavioural change interventions has focused on discrete events such as triggers to action [71, 72]. However, the concept of timing in our review reflected timing relative to a surgical career trajectory, i.e. when mastery of basic and/or advanced technical surgical skills has been achieved in earlier years and surgeons have mental capacity to develop in other areas. Our expert stakeholders confirmed that the demands of surgical training in the early stages mean that surgeons and surgical teams often have no capacity for non-technical developments. One surgeon described that when training juniors in theatre, patient safety comes first, and that leadership development is not a priority. This highlights specifically the need to practise leadership development in the context of surgery, not in classrooms. The evidence suggests that interventions delivered at key transition points in surgical careers may be more beneficial to surgeons and teams when they have the capacity to undertake additional learning and development.

We found that for interventions to improve leadership in surgery, they need to be delivered in contexts where there is an intimate learning environment within a speak-up culture, provide a variety of interactive learning activities, show a genuine investment in the intervention, and be customised to surgeons’ needs. Therefore, leadership of surgical teams may be best developed by allowing mixed-discipline surgical teams to train together, rather than training distinct professional groups in isolation, which was the case in most of the literature we reviewed. This was reinforced by expert stakeholders who emphasised that healthcare leadership is rarely enacted in isolation or in distinct professional groups.

Strengths and limitations

This review represents the first use of realist review methodology to explore how surgical leadership interventions need to be designed to improve the leadership of surgeons, their teams, and trainees. The strengths of the study stem from adopting rigorous methodological guidance for realist reviews as described in the RAMESES quality standards [21] (see Additional file 1). Use of a realist approach has allowed us to place emphasis on how contexts influence outcomes and to focus on identifying generative mechanisms, thereby producing findings that are transferable across different types of surgical leadership interventions. Whilst we have followed the realist review method and documented the steps that we took to arrive at our programme theory, we are fully aware that (in common with other qualitative research) this method is subjective, iterative, and interpretive, involving many more people than the core review team.

Our study limitations lie in the topic under consideration. The leadership literature is extensive (to crudely illustrate this, there were 330,583 hits for the term ‘leadership’ in MEDLINE at the time of searching). For this reason, we did not include articles which focused on distal outcomes outlined in our initial programme theory (e.g. patient outcomes, or organisational change) [26]. This decision was ratified through discussions with the wider research team and expert stakeholders who concluded that improvements to patient safety may result from advancing leadership but would only ever be considered as indirect evidence.

Our decision to limit distal outcomes was not just a pragmatic choice to prevent being overwhelmed by literature, but a methodological one. The further we extend outcomes, the less confident we can be that the cause can be attributed to the leadership intervention. Organisational and patient level outcomes are likely more affected by a complex set of variables. Hence, we designed our study to focus on outcomes that are proximate to the setting in which the leadership intervention took place, the surgical unit. Nevertheless, our decision to reduce our focus had consequences for the review as it limited our ability to fully achieve our initial aim, which was to understand how surgical leadership interventions work and to generate a strong evidence base to support use of surgical leadership interventions. We may have excluded effective leadership interventions due to their setting, for example interventions delivered at a national scale, such as those programmes offered by the NHS Leadership Academy in the UK [73]. That being said, we recognise that studies which have sought to establish causal links between leadership and performance, whether through improved patient outcomes or organisational change, have been criticised as circumstantial or anecdotal [66]. Having completed the realist review, it would be futile to argue against this narrative. We are in a stronger position to demonstrate the nonlinear relationships between leadership interventions and leadership improvement in surgery. Our programme theory highlights the complexity of the conceptualisations of leadership identified in our review. Our theory is inherently complex and indirect. In developing this work, we have produced evidence which highlights the limitations of expecting to see a causal relationship between the implementation of a leadership intervention and service improvement, the approach often adopted by the NHS and outlined in the NHS Long Term Plan [4].

In secondary research, the resulting synthesis is only as good as the primary data on which the synthesis is built. Whilst most included studies were of medium and high quality, a major limitation we encountered in our review was that most primary studies included only insight into the impact of surgical leadership interventions on individual leadership in isolation and did not consider the impact on the team or wider department. Because of this limitation in the literature scope, we adopted an individual skills-based framework to summarise the results in Table 4. Whilst this framework facilitates simple presentation of the results, it highlights a wider problem of how narrowly leadership is often conceptualised in surgery. We also found that primary studies often did not provide enough interventional detail to determine fidelity and understand how different aspects of leadership interventions influenced which outcomes and why.

In some cases, our CMOCs had gaps that could not be filled via the literature; most notably, this included a lack of mechanisms evidenced in the literature. This is not uncommon when conducting realist reviews, and something faced by Price and colleagues in their review of patient safety [28]. They found mechanisms were underdeveloped in the literature. In that research they investigated the processes of meetings and e-mail exchanges to verify and explicate mechanisms with their stakeholder group. A strength of our review is the extent to which we consulted professional and academic experts in this field to review, refine, and fill the gaps of our CMOCs.

Ideally, we would have conducted the stakeholder consultations in person as this can help build trust and rapport between participants. However, doing the consultations online was crucial to ensure surgeons could attend and fit the consultations around their work. Further, we would have liked to have consulted a more diverse range of stakeholders, and we acknowledge, for example, that we did not have any female surgeons or early-stage surgical trainees (e.g., training years 1–7) sharing their views. We acknowledge that doing this may have led to changes in the CMOCs. We did invite a range of surgeons and surgical trainees to our stakeholder events, but many declined due to availability.

Future research directions

It is clear that effective leadership can be important for the surgical profession, but it is by no means a panacea for success. An overarching finding of our review was the lack of literature which examined the entire surgical profession (i.e. not just the individual surgeon) and leadership in surgery more generally, i.e. not confined to an operating theatre. Most literature we identified positioned the surgeon as the target of the intervention and very few studies described the mechanism through which leadership was improved. This finding is evidence to support additional qualitative research which seeks to explore and illuminate mechanisms, to unpack the ‘black box’ that is leadership improvement.

Whilst it is important to understand what works in surgical leadership, we also need to understand the changing context in which leadership plays out, both in and outside of the operating theatre and beyond into the hospital and wider surgical community. We suggest that it is the enactment of leadership in context which will become important for improving leadership. Therefore, future research should consider surgical leaders embedded within surgical teams as a unit of analysis. We found that most leadership interventions are not grounded in theory or evidence. For example, we suggest that the AMO (ability, motivation, and opportunity) theory may be well suited to designing interventions which improve surgeons’ performance in the context of practice [11]. We therefore recommend that academics and clinicians developing, testing, and implementing leadership interventions in practice adopt our programme theory as evidence of what works in surgery, and that researchers perform primary studies which extend our programme theory and further refine our CMOCs.

Conclusions

In healthcare, evidence-based practice reigns. Yet, when it comes to leadership development in surgery, the same approach to building and adopting the evidence-base in practice falters. Investment in leadership development in healthcare is substantial. To see a return on this investment we need to ensure that the interventions we implement in practice to improve the leadership of surgeons and surgical teams are evidence based and theoretically informed.

Our realist review identified 19 CMOCs which are the starting point for this evidence base. We used the CMOCs to develop the first programme theory to explain in which context and for whom leadership interventions in surgery work and why. The programme theory provides evidence-based guidance for those who are conducting research on leadership in surgery or who are planning or designing evidence-based leadership interventions in surgery.