Introduction

One in 25 people undergoes a surgical procedure every year [1]. Surgery is intended to save lives, but unsafe surgical care can cause substantial harm; complications after inpatient operations occur in 25 % of patients and the reported crude mortality rate after major surgery is 0.5–5 % [2]. At least half of the cases in which surgery leads to harm are considered preventable [3]. Most surgical errors are caused by failures of non-technical skills such as communication, leadership and teamwork [4].

In 2008 the World Health Organization (WHO) developed a surgical safety checklist (SSC) in an attempt to minimise surgical adverse events [2]. The three-phase, 19-item checklist comprises perioperative items that directly target the execution of specific safety measures. The checklist is said to improve surgical outcomes through both direct and indirect mechanisms. Directly, measures such as ensuring timely administration of prophylactic antibiotics may decrease rates of postoperative infection. Indirectly, the checklist is reported to strengthen the ‘safety culture’ in operating theatres and thus decrease non-technical surgical errors, with a positive effect on all postoperative adverse events [5–9].

The checklist has been implemented as a standard of care in thousands of operating rooms worldwide, as it is relatively easy to implement and unlikely to cause harm [10]. However, there is emerging evidence that, to be effective, the checklist requires a deliberate implementation process, continual monitoring and learning within frontline teams [11]. It is thus necessary to determine the effects of the checklist on postoperative outcomes to validate this continued effort. Furthermore, the checklist may become a routine box-ticking activity that does not drive behavioural change, thus giving staff a false sense of security [12–14].

Previous literature reviews have all suggested an apparent reduction in postoperative adverse events following the implementation of the checklist; however, all have concluded that higher quality studies are needed [15–21]. Since the last published review, many large-scale studies have been published, including two randomised controlled trials (RCTs) [22–26]. Hence there is a need for an updated systematic review of the SSC. This systematic literature review examines the effects of the implementation of the WHO SSC on postoperative complications and mortality.

Methods

Protocol and registration

This systematic review is reported using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [27]. The review focuses on studies with primary quantitative data on the effects of the implementation of the WHO SSC on postoperative adverse events. The review was registered in the PROSPERO database, reference number: CRD42015024373.

Search criteria

A literature search of publications from 2007 to June 2015 was conducted. Two investigators (EdJ and CM) searched the MEDLINE, CINAHL, Scopus, Cochrane and ProQuest databases using the following search strategy: (WHO OR World Health Organisation OR World Health Organization) AND checklist AND (surgery OR surgical OR operative). The date last searched was 4 June 2015. Reference lists of relevant studies were searched by hand to identify additional publications. Authors of select studies were contacted for additional information. The two investigators screened the titles and abstracts of potential studies, and reviewed the full texts of potentially eligible studies where necessary.
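
The Boolean structure of the search strategy can be made explicit with a small sketch. It assumes case-insensitive, whole-word matching with no stemming or truncation; each database applies its own matching rules, so this illustrates only the AND/OR logic and not how MEDLINE or Scopus actually evaluate a query. The names `SEARCH_GROUPS` and `matches_search` are hypothetical.

```python
import re

# Term groups copied from the search strategy above.
# Groups are combined with AND; terms inside a group are combined with OR.
SEARCH_GROUPS = [
    ["WHO", "World Health Organisation", "World Health Organization"],
    ["checklist"],
    ["surgery", "surgical", "operative"],
]


def matches_search(text: str) -> bool:
    """Return True if every group has at least one term present in the text.

    Assumption: terms match as whole words or phrases, ignoring case.
    """
    def has_term(term: str) -> bool:
        pattern = r"\b" + re.escape(term) + r"\b"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None

    return all(any(has_term(term) for term in group) for group in SEARCH_GROUPS)


if __name__ == "__main__":
    title = "Effect of the WHO surgical safety checklist on mortality"
    print(matches_search(title))
```

Under these assumptions the term WHO also matches the ordinary word "who", which is one reason a broad query like this returns many irrelevant records that must be screened by title and abstract.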

Eligibility criteria

Included studies incorporated a population of patients undergoing surgical procedures, in which the WHO SSC was implemented, compared to a control group where the checklist was not used or a control group with low compliance to the checklist. The outcomes were quantitative data on postoperative complications or mortality, however defined by the authors. Postoperative pain, urinary tract infections, nausea and vomiting were not considered significant postoperative complications.

Studies were excluded if they were not written in English or did not use the WHO SSC or an adaptation of the WHO SSC. Studies were also excluded if the intervention concurrently consisted of a bundle of actions such that the sole effect of the safety checklist could not be isolated, for example, where pulse oximetry was introduced alongside the implementation of the checklist.

Data extraction and analysis

The two investigators used a standardised data sheet to extract data from included studies. Data were extracted for study setting, design and duration, sample size, surgical procedures included and quantitative patient outcomes, including postoperative complication and mortality rates. The two authors performed data extraction independently and a third review author (LB) adjudicated any discrepancies. The included studies were deemed unsuitable for meta-analysis since they were too heterogeneous and mostly observational.

Quality

Randomised controlled trials were assessed using the Cochrane RevMan Risk of Bias tool [28]. Non-randomised studies were assessed using a modified version of the previously validated Methodological Index for Non-Randomised Studies (MINORS) [29]. The authors removed two items, items six and seven, from the original 12-item index; a similar modification has previously been reported [16]. These items relate to an adequate duration of follow-up after the implementation of the checklist. There is currently no consensus about the most appropriate duration of follow-up. There may be an increased emphasis on surgical safety and higher levels of compliance with checklist use early after the intervention, resulting in falsely encouraging outcomes in studies with short follow-up periods. Alternatively, the checklist-induced cultural change may take time to develop, so studies with a short follow-up period may not show the full effects of the checklist's use. As such, an appropriate length of follow-up could not be defined.

Results

Search results

Database and reference list searches yielded 509 articles, of which the full texts of 109 were examined. Based on the inclusion and exclusion criteria, 25 studies were included (Fig. 1; Table 1) [27].

Fig. 1 Flow diagram showing identification of studies for inclusion in a systematic review of the effects of WHO SSC implementation on postoperative adverse events

Table 1 Characteristics of included studies (statistically significant results bolded)

Quality assessment

Two studies were RCTs, 13 were prospective observational studies and 10 were retrospective cohort studies. The mean Cochrane RevMan score for the two RCTs was nine out of a possible 14. The mean score on the modified MINORS tool was 14 (SD 3.6) out of a possible 20. Because the items assessed by these tools may not be equally important, we refrained from presenting a sum score for individual publications and instead present the individual components of the scores in Cochrane risk of bias figures (Figs. 2, 3) [28]. Four studies had a concurrent control group; the remaining studies were largely pre- and post-implementation comparisons. Several studies did not have adequately matched cohorts, with differences in the emergency status of the surgery, surgical specialty and patient characteristics.

Fig. 2 Risk of bias assessment using Cochrane RevMan criteria for randomised controlled studies

Fig. 3 Risk of bias assessment using MINORS criteria for non-randomised studies

Many studies did not report a sample size calculation. Studies that did report one often powered it to detect a significant change in the total pooled complication rate rather than in specific postoperative complications. As a result, many studies were underpowered to reach statistical significance for specific postoperative outcomes.
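
To illustrate why a study powered for pooled complications can be underpowered for a specific complication, the sketch below applies the standard normal-approximation sample size formula for comparing two independent proportions. The event rates are hypothetical and are not taken from any included study; `n_per_group` is an illustrative helper, and a two-sided alpha of 0.05 with 80 % power is assumed.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_before: float, p_after: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed per group to detect p_before vs p_after.

    Uses the unpooled normal approximation for two independent proportions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    variance = p_before * (1 - p_before) + p_after * (1 - p_after)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_before - p_after) ** 2)


# Hypothetical rates: the same 25 % relative reduction in a common pooled
# outcome and in a rare specific complication.
n_total = n_per_group(0.20, 0.15)      # pooled complications, 20 % to 15 %
n_specific = n_per_group(0.02, 0.015)  # one specific complication, 2 % to 1.5 %

if __name__ == "__main__":
    print(n_total, n_specific)
```

With these assumed rates, the rare outcome needs more than ten times as many patients per group as the pooled outcome for the same relative reduction.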

Risk of bias of included studies

Several general sources of potential bias and confounding were identified. Implementation approaches varied between studies, and teamwork-training initiatives may themselves have confounded the post-checklist data [30, 31]. High levels of communication and collaboration are associated with lower overall rates of morbidity [32]. Bliss and colleagues reported a statistically significant decrease in postoperative complications from 23.9 to 15.9 % after three teamwork-training sessions; this was further reduced to 8.2 % after the checklist was adopted [33].

The WHO recommends that local stakeholders adapt the checklist, so the specific checklists used often vary. This may affect rates of specific postoperative complications and makes studies difficult to compare. The definitions of postoperative complications and of specific postoperative outcomes also varied between studies, further hindering comparison.

Many studies used direct observation to evaluate compliance, potentially leading to a Hawthorne effect in which non-technical skills such as communication and leadership improved because teams were being observed rather than because of the intervention itself.

Surgical adverse event rates are influenced by many factors; whilst studies attempted to adjust for known confounders, it is likely that unknown confounding factors were not adjusted for. Most of the reviewed studies did not have a concurrent control group, and unknown confounding factors likely affected the interpretation of their results. As the use of the checklist is seen as best practice, it may be unethical to withhold its use in a clinical setting. In addition, when concurrent control groups are used, the contamination effect must be considered, especially for indirect effects of the checklist such as enhanced leadership, teamwork and the resultant improvement in ‘safety culture’.

Two randomised controlled trials

Chaudhary et al. randomised 700 patients to checklist use or omission in a hospital in India. Patients were blinded to the study whilst the treating teams were not, and as such contamination effects may have significantly affected the study’s results. Mortality, bleeding, abdominal and wound-related complication rates decreased significantly with the use of the checklist. The total complication rates, number of complications per patient, length of hospital stay, and rates of sepsis, respiratory, renal and cardiac complications did not change [26].

A larger stepped wedge cluster randomised controlled trial with a sample size of 4475 was conducted in two hospitals in Norway. In this study, the checklist intervention was sequentially rolled out across five surgical specialties in a randomised order. As such, the cohorts were not adequately controlled; there was a discrepancy in surgical specialty and type of anaesthesia used between cohorts, and the intervention group was more likely to undergo emergency surgery. In addition, 25.6 % of the procedures allocated to the intervention step were not compliant with the checklist and the results of these surgeries were excluded. The reasons for non-compliance were not assessed and this is a likely source of bias. The rates of total complications, unplanned readmission to theatre, infectious complications, pneumonia, haemorrhage, respiratory and cardiac complications significantly decreased, whilst mortality, sepsis, surgical site infections and thromboembolic complications did not significantly change [23].

When the results of the two randomised controlled trials were compared, the only outcome that decreased significantly in both studies was the postoperative bleeding rate.

Developed vs. developing countries

A sub-analysis was performed in which studies were divided into developing and developed nations according to the World Bank classification [34]. Multinational studies that did not differentiate between high- and low-income countries were not included in the sub-analysis. In developed countries, 36 % of studies (5 [23, 33, 35–37] out of 14 studies [6, 22–25, 33, 35–42]) showed a significant decrease in total complication rates, compared to 83 % of studies (5 [38, 43–46] out of 6 studies [26, 38, 43–46]) conducted in developing nations. Mortality was not decreased in any of the 13 studies in developed nations [6, 22–25, 35, 37, 38, 41, 42, 47–49], whereas it was decreased in 75 % of studies (3 [26, 38, 45] out of 4 studies [26, 38, 45, 46]) in developing nations. Two studies reported an increase in mortality or complications; both of these studies were in developed nations [35, 39]. Thus in reviewed studies, the effect of the checklist seems to be greater in developing nations.
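
The percentages in this sub-analysis are simple vote counts: the share of studies in each setting that reported a significant decrease. The sketch below recomputes them from the study counts given in the text; the dictionary layout and the helper name `percent_of_studies` are illustrative only.

```python
# (studies reporting a significant decrease, studies reporting the outcome),
# using the counts stated in the sub-analysis text.
COUNTS = {
    ("developed", "total complications"): (5, 14),
    ("developing", "total complications"): (5, 6),
    ("developed", "mortality"): (0, 13),
    ("developing", "mortality"): (3, 4),
}


def percent_of_studies(significant: int, total: int) -> int:
    """Share of studies with a significant decrease, rounded to a whole percent."""
    return round(100 * significant / total)


if __name__ == "__main__":
    for (setting, outcome), (significant, total) in COUNTS.items():
        share = percent_of_studies(significant, total)
        print(f"{setting:10s} {outcome:20s} {significant}/{total} = {share} %")
```

Because the denominators are small (four and six studies in developing nations), a single additional study could shift these percentages considerably, so the comparison is best read as descriptive.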

Total complications

The total complication rate was reported in 20 studies [6, 22–26, 33, 35–41, 43–46, 49, 50]; ten reported significantly decreased rates (range 34–67 %) [23, 33, 35, 37, 38, 43–46, 50] and one reported increased complication rates (25 %) [39].

Mortality rates were reported in 18 studies [6, 22–26, 35, 37, 38, 41, 42, 45–51]; four reported a significant decrease in rates (range 43–100 %) [26, 38, 45, 50], whilst one reported an increase following the implementation of the checklist (238 %) [35].

Length of admission was examined in four studies [22, 26, 39, 40]; one reported a statistically significant but clinically insignificant decrease in length of stay by 0.04 days (p = 0.003) [22].

Unplanned return to the operating room was examined in eight studies [6, 22–24, 36, 38, 44, 47]; four found a significant decrease in rates (range 8–67 %) [22, 23, 38, 44].

Wound related complications

Surgical site infections were examined by 14 studies [6, 22–24, 33, 35, 36, 38, 43–46, 48, 50]; four showed a statistically significant decrease (range 41–85 %) [38, 45, 46, 50]. Wound dehiscence was examined by five studies; no significant changes were found [22, 24, 25, 33, 36]. Combined wound complications were examined by two studies; both found a decrease (46 and 61 %) [26, 36].

Haematological studies

Rates of deep vein thrombosis (DVT) and/or pulmonary embolism (PE) were examined by five studies [22–24, 33, 36]; the only significant change was that one study reported an increase in DVT rates by 133 % [22].

Postoperative bleeding rates were examined by eight studies [22–24, 26, 33, 36, 45, 50]; three found a significant decrease (range 34–82 %) [23, 26, 50].

Miscellaneous other

Total infection rates were examined in five studies [23–25, 33, 36]; rates decreased in two studies [23, 24]. Rates of sepsis were examined in six studies [22–24, 26, 33, 35]; rates decreased in one study [24]. Ten studies examined respiratory complications [22–26, 33, 36, 38, 43, 44]; one study found a decrease in rates of pneumonia and in total respiratory complication rates [23], and another found an increase in ventilation use [22]. Renal complications were examined in five studies [22, 24, 26, 33, 43]; one found a decrease in acute renal failure [33] and no other results reached significance. Cardiac complications were reported in five studies [22–24, 26, 33]; one found a significant decrease in total rates [23]. One study examined total abdominal complications and found a reduction in complication rates [26].

Wrong-sided surgery

Two studies reported rates of wrong-sided procedures [45, 52]. One study found a statistically significant decrease; one patient had a wrong-sided surgery before the implementation, and no patients after the checklist was implemented (1.38 to 0 %, p < 0.05) [45].

Studies with increased rates of adverse outcomes

Two studies showed an increase in postoperative complications and mortality after the implementation of the checklist. In both studies, the comparisons were unadjusted, precluding meaningful conclusions.

Morgan et al. examined the effect of checklist compliance improvement initiatives on surgical outcomes, using a concurrent control group for comparison. In the intervention group, postoperative complications significantly increased, whilst in the concurrent control group complications decreased (21.5 to 26.8 and 27.1 to 25.7 %, p = 0.05). The study was limited by a small sample size, which prevented risk adjustment for differing patient characteristics between the groups. Another limitation was that a direct observational model was used; this is vulnerable to the Hawthorne effect and contamination [39].

Boaz et al. conducted a retrospective review of surgical outcomes before and after implementation of the checklist. It included 760 orthopaedic surgery patients and found an increase in postoperative mortality (0.8 to 2.7 %, p = 0.049) following the checklist's implementation. The study reported that the composite postoperative complication rate decreased (25.9 to 18.9 %, p = 0.02), but this was not significant after controlling for confounding variables. The study's conclusion and discussion focussed on a significant decrease in postoperative fever after implementation of the checklist [35].
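
The percentage changes quoted in the results appear to be relative changes in the event rate, not differences in percentage points. As a minimal check, the sketch below applies the usual relative change formula to the before and after rates given for these two studies; `relative_change_percent` is an illustrative helper, not a function from any of the reviewed papers.

```python
def relative_change_percent(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100


# Before and after rates (in %) as stated in the two paragraphs above.
REPORTED_RATES = {
    "Boaz et al., mortality": (0.8, 2.7),
    "Boaz et al., composite complications": (25.9, 18.9),
    "Morgan et al., complications (intervention)": (21.5, 26.8),
    "Morgan et al., complications (control)": (27.1, 25.7),
}

if __name__ == "__main__":
    for label, (before, after) in REPORTED_RATES.items():
        change = relative_change_percent(before, after)
        points = after - before
        print(f"{label}: {change:+.1f} % relative, {points:+.1f} percentage points")
```

On this reading, the rise in mortality from 0.8 to 2.7 % is a relative increase of about 238 %, and the rise in complications from 21.5 to 26.8 % is about 25 %, which agrees with the figures quoted for these studies in the results. The corresponding absolute differences are 1.9 and 5.3 percentage points, which is why relative changes on small baseline rates can look dramatic.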

Discussion

A surgical safety initiative, which has been implemented into thousands of operating rooms around the world, in an attempt to decrease preventable postoperative complications, should have a strong body of evidence supporting its use. This systematic review found that the effects of the checklist on postoperative outcomes were inconsistent. There may be some benefit to the implementation of the WHO SSC, with this benefit appearing to be greater in developing countries.

There is little evidence to explain why the checklist appears more beneficial in developing than in developed nations, and the proposed explanations are largely speculative. Developing countries may have an inherently higher baseline rate of complications and thus more room for improvement initiatives to have an effect. Another point to consider is that the checklist partially works by improving non-technical skills such as teamwork, leadership and communication. These factors have a large societal and cultural aspect, which may differ between sites. It is also possible that facets of the checklist were already a standard of care in developed countries prior to adoption of the checklist, reducing the effects of the checklist.

Rates of surgical adverse event outcomes are not independent. Postoperative complication rates are associated with postoperative mortality rates [53]. The checklist aims to reduce preventable surgical error and should decrease rates of specific postoperative complications, total surgical complications and postoperative mortality. Outcomes such as the length of stay should also decrease, as these are indirect measures of the postoperative complication rates [54]. The reviewed literature did not show congruency amongst outcomes of surgical adverse event rates. For example, Chaudhary et al. reported that postoperative mortality reduced significantly (by 43 %), whilst there was no significant change in total postoperative complication rates [26]. This phenomenon was observed both within some studies, and when all significant results from the reviewed literature were compared.

An effective safety improvement initiative should have consistent effects on outcomes. The effects of the checklist were inconsistent; this was evident within multicentre studies, where the effect of the checklist often varied dramatically between sites. For example, Hayes et al. found significant decreases in postoperative adverse event rates in three of eight sites; the remaining five sites did not have any significant changes in outcomes [38]. The reported benefits of the checklist were from pooled data of all sites. Similarly, Urbach et al. examined the effects of the checklist at 101 hospitals; of these, six had a significant decrease in adverse event rates, three had a significant increase and 92 had no significant change in outcomes [22]. Individual sites may not have been sufficiently powered to detect changes, leading to a type II error. Regardless, the effect of the checklist on postoperative outcomes appears to be highly variable.

Reviewed studies tended to report substantial improvements in complication rates (range 34–67 %), or show no significant change. Half of surgical complications are reported to be preventable [3]. Hence even if the checklist stopped all preventable errors, postoperative complications would only reduce by 50 %. A change larger than this is likely to have contributing confounding factors or be biased by a poor study design.
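
The ceiling argument above can be written out as a short calculation. The sketch assumes, as the paragraph does, that exactly half of complications are preventable and that a perfect checklist would eliminate all of them; the baseline rates are hypothetical, and the function names are illustrative.

```python
def best_case_rate(baseline_rate: float, preventable_fraction: float = 0.5) -> float:
    """Complication rate left over if every preventable complication is avoided."""
    return baseline_rate * (1 - preventable_fraction)


def relative_reduction_percent(before: float, after: float) -> float:
    """Relative fall from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100


def exceeds_ceiling(observed_reduction_percent: float,
                    preventable_fraction: float = 0.5) -> bool:
    """True if a reported reduction is larger than the preventable share allows."""
    return observed_reduction_percent > preventable_fraction * 100


if __name__ == "__main__":
    # Hypothetical baseline complication rates (in %); the ceiling is the
    # same relative reduction whatever the baseline.
    for baseline in (10.0, 25.0):
        after = best_case_rate(baseline)
        print(baseline, after, relative_reduction_percent(baseline, after))

    # Ends of the range of significant reductions reported in this review.
    for observed in (34, 67):
        print(observed, exceeds_ceiling(observed))
```

Under this assumption the lower end of the reported range (34 %) is within the ceiling, whereas the upper end (67 %) is not. If more than half of complications were in fact preventable, the ceiling would rise accordingly.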

Another factor to consider is publication bias. An under-representation of studies showing negative or no effects is well documented; studies with results supporting a hypothesis have a 50 % higher likelihood of publication compared to studies with a negative or neutral outcome [55]. A focus on statistically significant findings was also observed within reviewed studies, with some authors emphasising specific postoperative outcomes that were improved by the checklist while neglecting to comment on the many outcomes that were unchanged or worsened with the use of the checklist [35].

The checklist may be too generalised as it is intended to be applied to all surgical disciplines. Some specialties have called for their own specific checklists to be created whilst others have proposed a checklist tailored to each specific operation [25, 56–58]. Further studies are needed to determine the effects of specialty-wide surgical safety checklists.

Many of the studies excluded patients below the age of 16 or 18; there is thus a lack of literature reporting the effects of the checklist on a paediatric population. Younger patients may not be able to confirm identity, site or procedure and may lack the ability to give consent. Further studies on the effects of the checklist on a paediatric population are warranted.

A limitation of this review is that reported compliance with the checklist was not scrutinised. Measures of compliance are largely based on specific aspects of care embedded in the checklist. This may be an inappropriate measure of the ‘safety culture’ that the checklist is said to promote. Ticking all the boxes does not mean that the actions the checklist calls for have been completed. Some studies did not report compliance; where it was described, there was marked variability in compliance between checklist items [16]. Many studies used data from administrative databases, which may report higher rates of compliance than those reported by auditing observers [59, 60]. This heterogeneity makes it difficult to compare compliance rates between studies, and even more so to relate these to adverse event outcome measures in an attempt to draw any meaningful conclusions.

A further limitation is that a meta-analysis was not conducted. Combining observational studies of heterogeneous quality may produce highly biased estimates. Included studies had very diverse patient populations and sample sizes. One study had a larger sample size than all other studies combined; the results of a meta-analysis would therefore invariably be skewed towards this study’s outcomes.

Conclusion

The WHO SSC has been widely implemented in an attempt to decrease preventable postoperative complications. This systematic literature review examined the effects of the implementation of the WHO SSC on postoperative adverse events. The review included results of three times as many studies as previously reviewed. The effects of the checklist on postoperative outcomes were inconsistent. With the observed lack of congruency between specific postoperative outcomes and the widespread lack of concurrent control groups, it is possible that many of the positive changes attributed to the checklist were due to temporal changes rather than the checklist itself. This is likely compounded by publication bias, where studies reporting non-significant results are less likely to be published. There may be some benefit to the implementation of the WHO SSC, and the benefit appears to be larger in developing countries. Further studies are needed to support the implementation and continued use of the checklist in thousands of operating rooms around the world.