Background

Labour induction refers to the process of artificially stimulating the uterus to begin labour [1], which is an increasingly common procedure. Cervical status, measured by the Bishop score [2], is a good predictor for the outcome of labour induction. If the cervix is unfavourable, no method is highly successful, and a ripening process is generally employed to obtain cervical effacement and dilatation prior to induction [3,4,5]. Methods used for cervical ripening can be broadly divided into mechanical devices and pharmacologic options [6, 7]. Compared with pharmacologic agents, mechanical methods, which were the first methods developed to ripen the cervix or induce labour [8], have similar levels of effectiveness but incur fewer episodes of adverse events (such as uterine tachysystole), have lower costs and are easier to preserve [6].

The balloon catheter, including both double- and single-balloon catheters, appears to be a widely accepted mechanical method and is recommended by the WHO for the induction process [9]. The original version of the Foley (single-balloon) catheter was initially described by Barnes in 1863 but was not described again until 1967, by Embrey and Mollison [10]. In 1991, Atad described the first double-balloon variation [11]. The Cook Cervical Ripening Balloon (CCRB), which uses an identical mechanism to that of the Atad catheter, was approved by the United States Food and Drug Administration (USFDA) in 2013 [12]. Only the double-balloon catheter (either Atad or Cook) is specifically designed and licensed for labour induction, while the Foley catheter is used off-label for this purpose.

Mechanical ripening devices apply pressure to the internal face of the cervix, directly overstretching the lower uterine segment and indirectly increasing the localised secretion of prostaglandin [13]. In addition to the local effect, mechanisms that involve neuroendocrine reflexes (such as the Ferguson reflex) may promote the onset of contractions [14]. Purportedly, the double-balloon (either Atad or Cook) option has an additional cervico-vaginal balloon, which applies greater pressure to both sides of the cervix and avoids the need for traction [11].

Given the increasing induction rate, knowledge of even small differences between methods could be useful, not only to guide clinical practice but also to explore the mechanism underlying the mechanical induction of labour and to promote a better understanding of the optimal methods for labour induction. However, studies examining the superiority of the double-balloon catheter report mixed results [15,16,17,18,19,20,21,22,23]. We conducted this meta-analysis and systematic review using the best available evidence to assess the efficacy, efficiency, safety and patient satisfaction of double-balloon catheters in comparison with those of single-balloon devices among women with unfavourable cervixes who underwent labour induction.

Methods

Search strategy

Together with a clinical librarian (R.O.), we conducted an electronic literature search of the PubMed, EMBASE, OVID, SCI (via WOS), CENTRAL (The Cochrane Central Register of Controlled Trials) and ClinicalTrials.gov databases from inception through February 14, 2018. The search strategy was based on the PICOS principle, utilising medical subject headings and Boolean logic-based free-text combinations of the following search terms: “labour induction”, “cervical ripening”, “balloon”, “Foley”, “Cook” and “Atad”. In addition, we used sensitivity-maximising search filters to identify randomised controlled trials [24]. The abovementioned databases yielded several meta-analyses and systematic reviews. To identify more pertinent meta-analyses or systematic reviews, an additional search was performed in the CDSR (Cochrane Database of Systematic Reviews) database. All of the reference lists from the relevant reviews were manually searched to locate further eligible trials. There were no language restrictions. Differences of opinion were resolved by team discussion.

Study selection and data collection

All related RCTs that directly compared the double-balloon catheter with the single-balloon catheter for the purposes of labour induction or cervical ripening were included in the analysis. There were no restrictions with regard to settings, demographics, obstetric characteristics (e.g., race, maternal age, and gestational weeks) or outcome measures. We excluded the following types of studies: (1) studies of balloon catheters used for outpatient purposes; and (2) protocols, observational studies, and secondary analyses of previous studies and guidelines. Prior to the formal review process, we performed pretesting with the kappa statistic to calculate the level of agreement between the inclusion/exclusion decisions of different reviewers and adjusted our criteria until kappa ≥ 0.75.
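The agreement check described above can be illustrated with a minimal sketch of Cohen's kappa for two reviewers. The function name, the decision labels and the example screening decisions are hypothetical and are not taken from the review; only the 0.75 threshold comes from the text.

```python
from collections import Counter

KAPPA_THRESHOLD = 0.75  # agreement level required before formal screening (from the text)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of records")
    n = len(rater_a)
    # Observed agreement: share of records given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pilot round: 8 records, one disagreement.
reviewer_1 = ["include", "include", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "include", "exclude", "exclude",
              "exclude", "exclude", "exclude", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
criteria_need_revision = kappa < KAPPA_THRESHOLD
```

In this invented pilot round a single disagreement among eight records already pulls kappa below the threshold, which is the situation in which the criteria would be refined and the pretest repeated.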

To improve the precision of the collected data, two reviewers (X.Y.L., Y.W.), one who majored in obstetrics and one who did not, screened each record for eligibility and independently extracted and tabulated the following information from the text, tables, and graphs: lead author; publication year; country of origin; study design; participants and intervention characteristics; outcomes; and sponsor. Prior to determining the categories for the data collection forms, a pilot test was performed using representative samples of the studies to be reviewed. All of the collected data are available upon request.

Due to the uncertain benefits of blinded assessments and the large workload, we did not conceal the general contents of the studies during this process. Any disagreements were resolved through discussion, or if necessary, through consultation with a third reviewer (F.Z.) who specialises in evidence-based medicine. When information regarding any of the extracted data points listed above was unclear, an attempt was made to access further details by contacting the authors of the original reports.

Selection of outcomes

The primary and secondary outcomes were defined before trial retrieval was performed. The primary outcome was the caesarean delivery rate. The secondary outcomes included: (1) catheter placement (placement difficulty/failure, spontaneous expulsion); (2) intervals (insertion to delivery, insertion to expulsion/removal, expulsion to delivery); (3) Bishop score increment; (4) vaginal delivery (vaginal delivery within 24 h, normal vaginal delivery, assisted vaginal delivery); (5) analgesia usage; (6) maternal adverse events (death, infection, postpartum haemorrhage); (7) neonatal adverse events (death, low Apgar score, NICU admission); (8) length of hospitalisation; and (9) satisfaction (pain during the process, maternal total satisfaction). While we attempted to collect all of the above datapoints from all of the analysed studies, only those that provided all of the data appear in the analysis tables.

Quality assessment

Two independent investigators (X.Y.L., Y.W.) openly (not blinded) assessed the methodological quality of the included RCTs based on the Cochrane risk-of-bias tool. Quality was graded based on the following criteria [25]: (1) high quality: both randomisation and allocation concealment were assessed as having low risks of bias, and all other items were assessed as having low or unclear levels of risk; (2) low quality: either randomisation or allocation concealment was assessed as having a high risk of bias, regardless of the risk levels of other items; and (3) moderate quality: trials did not meet the criteria for high or low quality. Discrepancies were resolved by consensus.
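The three grading criteria can be expressed as a small decision function. This is only a sketch of the rule as worded above; the function name, the string labels and the example inputs are assumptions made for illustration.

```python
LOW, UNCLEAR, HIGH = "low", "unclear", "high"  # risk-of-bias judgements (labels are illustrative)


def grade_quality(randomisation, allocation_concealment, other_items):
    """Map risk-of-bias judgements to a quality grade following criteria (1)-(3)."""
    # (2) Low quality: high risk in either key domain, regardless of other items.
    if HIGH in (randomisation, allocation_concealment):
        return "low quality"
    # (1) High quality: both key domains low risk, all other items low or unclear.
    if (randomisation == LOW and allocation_concealment == LOW
            and all(risk in (LOW, UNCLEAR) for risk in other_items)):
        return "high quality"
    # (3) Moderate quality: everything that is neither high nor low quality.
    return "moderate quality"


# Hypothetical trial: adequate randomisation, unclear allocation concealment.
example_grade = grade_quality(LOW, UNCLEAR, [LOW, HIGH, LOW])
```

Checking the low-quality rule first mirrors the phrase "regardless of the risk levels of other items".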

Statistical analysis

All statistical analyses were performed with RevMan version 5.3, with the help of a statistician (X.N.Z.). The relative risks (RRs) and mean differences (MDs), with corresponding 95% confidence intervals (CIs), were used to describe the intervention effects for dichotomous and continuous variables, respectively. All potential data conversions utilised standard formulae recommended by the Cochrane Handbook [24].
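For readers unfamiliar with these effect measures, the sketch below computes a single-trial relative risk and mean difference with 95% confidence intervals using the usual large-sample formulae. It is not the RevMan implementation, the function names are illustrative, and the input numbers are invented rather than drawn from any included trial.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def relative_risk(events_a, total_a, events_b, total_b):
    """Return (RR, lower, upper) for group A versus group B."""
    if min(events_a, events_b) <= 0:
        raise ValueError("zero-event arms need a continuity correction (not handled here)")
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - Z_95 * se_log)
    upper = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, lower, upper


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (MD, lower, upper) for group A minus group B."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return md, md - Z_95 * se, md + Z_95 * se


# Invented example: 20/100 caesarean deliveries versus 25/100.
rr, rr_low, rr_high = relative_risk(20, 100, 25, 100)
# Invented example: Bishop score increment 5.0 (SD 2.0) versus 4.0 (SD 2.0), 100 women per arm.
md, md_low, md_high = mean_difference(5.0, 2.0, 100, 4.0, 2.0, 100)
```

An RR interval that spans 1, or an MD interval that spans 0, corresponds to a non-significant difference at the 5% level.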

Heterogeneity was assessed with Cochran’s Q test and the I² statistic; a Q test p-value < 0.1 or an I² value ≥ 50% was taken to indicate significant heterogeneity. When neither the p-value nor the I² value indicated heterogeneity, we chose the fixed-effect model. Otherwise, a random-effects model was used.
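The model-selection rule can be sketched as follows. The Q and I² calculations use generic inverse-variance weights, which is an assumption and may differ from the weighting RevMan applied; the function names are illustrative. The Q test p-value is passed in directly (it comes from a chi-squared distribution with k − 1 degrees of freedom for k studies). The (p-value, I²) pairs in the usage lines are the ones reported in the Results.

```python
def cochran_q(effects, variances):
    """Cochran's Q for study effects with inverse-variance weights (an assumption)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q, df):
    """I² as a percentage: share of variability beyond what chance would explain."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def choose_model(q_p_value, i2_percent):
    """Fixed-effect only when neither criterion signals heterogeneity."""
    heterogeneous = q_p_value < 0.1 or i2_percent >= 50.0
    return "random-effects" if heterogeneous else "fixed-effect"


# Toy data: two studies with effects 0 and 1, each with variance 0.1.
q_toy = cochran_q([0.0, 1.0], [0.1, 0.1])
i2_toy = i_squared(q_toy, df=1)

# Values reported in the Results of this review.
primary_model = choose_model(0.04, 55)        # caesarean delivery, all trials
without_salim_model = choose_model(0.11, 45)  # after excluding Salim 2011
nulliparous_model = choose_model(0.11, 56)    # Bishop score increment, nulliparous
```

Under this reading, the nulliparous Bishop score analysis is treated as heterogeneous because its I² exceeds 50% even though its Q test p-value is above 0.1.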

Subgroup analysis was pre-specified and performed on parity. A sensitivity analysis was conducted to identify studies involving data conversions that may have exerted a disproportionate influence on the pooled estimates. We assessed publication bias by examining funnel plots for the primary outcome only.

Results

Study characteristics

The literature search and screening process is shown in Fig. 1. Initially, 1326 potentially relevant records were identified. The titles and abstracts were reviewed, and 12 relevant trials were further screened. After thorough investigation, 7 RCTs, containing 1159 women and available data (577 and 582 in the double- and single-balloon groups, respectively), were determined to be eligible for inclusion [15,16,17,18,19,20,21]. The characteristics of the included trials are summarised in Table 1. Table 2 shows the risk of bias and the corresponding quality of each individual trial, which is illustrated in Fig. 2a and b. Basic demographic and obstetric variables are presented in Table 3. Except for postdates, which only two studies reported and which show slight heterogeneity, all other variables were comparable.

Fig. 1 Literature search and screening process

Table 1 Characteristics of the included trials
Table 2 Risk of bias and corresponding quality
Fig. 2 a Risk of bias graph. b Risk of bias summary

Table 3 Basic demographic and obstetric variables

Of the 7 RCTs, 3 trials [15, 18, 21] focused on nulliparous women, while 2 trials [17, 20] conducted subgroup analysis by parity. These 5 trials, which included 781 women (595 nulliparous and 186 multiparous), were suitable for parity subgroup analysis.

Effects of interventions

All trials reported the rates of caesarean section. There was no significant difference in the rate of caesarean delivery between the two catheter types (RR, 0.88 [0.65, 1.2]; p-value, 0.43), but heterogeneity existed among trials (Q p-value, 0.04; I², 55%) (Fig. 3a). A corresponding funnel plot is shown in Fig. 3b. During sensitivity analysis, heterogeneity disappeared only when Salim 2011 [20] was excluded (Q p-value, 0.11; I², 45%), while the pooled effect remained robust (no significant difference). The secondary outcomes, shown in Table 4, did not differ significantly between the two types of catheter, except for the Bishop score increment (MD, 0.57 [0.28, 0.86]; p-value, 0.0001).

Fig. 3 a Forest plot of caesarean delivery. b Funnel plot of caesarean delivery

Table 4 Secondary outcomes

Subgroup analysis results by parity are shown in Tables 5 and 6. Only the Bishop score increment in nulliparous women exhibited a statistically significant difference; however, heterogeneity was demonstrated among studies (MD, 1.08 [0.38, 1.78]; Q p-value, 0.11; I², 56%; p-value, 0.002), suggesting that the double-balloon catheter may have a greater ability to increase the Bishop score. Unless otherwise highlighted, studies were homogeneous, and sensitivity analysis displayed no meaningful changes.

Table 5 Outcomes by parity (nulliparous)
Table 6 Outcomes by parity (multiparous)

Discussion

Summary of main results

Efficacy and efficiency

Balloon catheters were initially designed for cervical dilatation and ripening during labour induction. The best indicator of efficacy is the Bishop score increment. However, because it is correlated with baseline data, the Bishop score served only as a secondary outcome. No significant differences were observed for obstetric characteristics (including the Bishop score before catheter insertion) between women treated with the single-balloon catheter and those treated with the double-balloon catheter. Therefore, we could use the Bishop score after catheter removal (the second Bishop score) to roughly calculate this effect size, and it was not necessary to perform covariance analyses to adjust for the baseline data. According to our analysis, the double-balloon catheter produces a significantly greater increase in the Bishop score, especially for nulliparous women. However, this result was not observed for the multiparous subgroup. In support of this finding, one study [17] reported a Bishop score > 6 at balloon removal, and a similar trend was observed for both the overall population and the subgroups. Additionally, the ripening success rates (as defined by the individual articles) appeared to be higher in the double-balloon groups, but without enough statistical power to determine significance [16, 19, 22, 23]. Atad et al. also reported similarly large average increments in the Bishop scores for both nulliparous and multiparous women for the double-balloon catheter, without a single-balloon catheter comparison group [11]. Later, the researchers reported that the Bishop score increment when employing the single-balloon catheter was lower than that achieved by the double-balloon catheter, with a higher failure rate [26].

Efficiency, best evaluated by the interval length and the 24 h delivery rate, is comparable regardless of parity. In the double-balloon catheter group, the interval from insertion to delivery appears to be longer, while the interval from expulsion to delivery appears to be shorter, though neither measure achieves significance. Ahmed et al. [15] stated that women treated with a single-balloon catheter had a shorter insertion-to-amniotomy time (p = 0.02) than women treated with a double-balloon catheter, while Pennell et al. [18] found that the length of labour did not significantly differ (p = 0.152) between the two groups. There is little consensus on the time from insertion to active labour: Pennell et al. [18] favoured the single-balloon catheter (p = 0.014), while Rab et al. [19] demonstrated no obvious differences. Ahmed and Mei-Dan [15, 22] suggested that the shorter interval between insertion and expulsion for the single-balloon catheter likely resulted in the observed shorter induction-to-delivery interval, although the second Bishop score was lower in this group.

The frequencies of placement difficulty or failure and of spontaneous expulsion are similar between the two groups. In addition, Salim et al. [20] found that women who spontaneously expelled their catheter demonstrated favourable outcomes with regard to shorter times from induction to delivery (1.10 [1.06–1.15]; p = 0.001) and a significantly lower proportion of operative deliveries (2.15 [1.26–3.69]; p = 0.003).

Safety

Both maternal and neonatal adverse events are of great concern. Although we hoped to consider mortality data, no study provided this information. Other measurements were also equivalent, including maternal infection, postpartum haemorrhage, low Apgar scores and NICU admissions. Some studies [18, 20] also reported placental abruptions, uterine hyperstimulation, cord prolapse, malpresentation, and Apgar < 4 at 1 min, with no significant differences between groups.

Satisfaction

Patient-reported outcomes (PROs), such as maternal satisfaction, represent what is most important to patients about a condition and its treatment [24]. However, few reports related to PROs were found. Here, we can report patient satisfaction based on two original reports [15, 19], both evaluated by the visual analogue scale (VAS) [27], with identical measurement times and protocols. The pooled results of these two studies suggest similar satisfaction levels for the two catheter types.

Comprehensive outcomes

Delivery modes, which are of particular clinical concern, represent a comprehensive measurement of the effectiveness and safety of labour induction protocols and can incorporate economic evidence. Caesarean section delivery is the most frequently used outcome pre-specified by trials. According to our analysis, no strong evidence exists to demonstrate which mechanical device is more effective, and heterogeneity exists among studies. Similarly, both normal and assisted vaginal delivery rates were comparable between groups, regardless of parity, as were the rates of analgesia usage during the ripening process and the lengths of hospitalisation.

Heterogeneity

Heterogeneity exists in many results, which may be the result of differences in study design or quality, participants, interventions, demographic features or local policies. During our heterogeneity test, three studies [17, 19, 20] were potential candidates for being the sources of heterogeneity. Unlike other studies, Rab et al. [19] enrolled women who had experienced a stillbirth and had scarred uteri, which could be responsible for increasing the general heterogeneity. Additional differences among these studies involved parity and balloon volumes (discussed below).

Applicability of evidence

Guiding clinical practice

Although the double-balloon catheter results in more favourable Bishop scores, it appears to result in prolonged intervals. No differences were observed for delivery mode, which is the most meaningful obstetrical outcome. Economic considerations relate mainly to hospitalisation length, delivery mode and the device itself. Notably, the single-balloon (Foley) catheter is approximately 30–40 times cheaper than the double-balloon catheter at different institutions, and the price difference varies from country to country. According to the manufacturers, Foley catheters cost approximately $1.12, while Cook catheters cost approximately $39.33. At our hospital in China, the single-balloon catheter costs 20–30 RMB while the double-balloon catheter costs 600 RMB, and the fee for placing a balloon catheter is about 600 RMB in both cases. Given that caesarean section rates and hospitalisation lengths were similar in the two groups, and given the substantial price difference between the devices, the single-balloon catheter seems likely to be more cost-effective for labour induction, particularly in low-resource settings.
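The price comparison can be made concrete with a short calculation based only on the figures quoted above. The function and variable names are illustrative, and adding the placement fee to the device price to obtain a per-procedure total is our simplifying assumption rather than a costing model from any included study.

```python
def price_ratio(double_balloon_price, single_balloon_price):
    """How many times more the double-balloon device costs than the single-balloon device."""
    return double_balloon_price / single_balloon_price


# Manufacturer prices quoted in the text (USD).
usd_ratio = price_ratio(39.33, 1.12)

# Local hospital prices quoted in the text (RMB): single 20-30, double 600.
rmb_ratio_low = price_ratio(600, 30)
rmb_ratio_high = price_ratio(600, 20)

# Assumption: per-procedure total = device price + placement fee (about 600 RMB for both).
PLACEMENT_FEE_RMB = 600
total_ratio_low = (600 + PLACEMENT_FEE_RMB) / (30 + PLACEMENT_FEE_RMB)
total_ratio_high = (600 + PLACEMENT_FEE_RMB) / (20 + PLACEMENT_FEE_RMB)
```

The device-only ratio depends strongly on the setting, and once an identical placement fee is included the per-procedure gap narrows considerably, so the absolute saving per induction may be more informative than the ratio.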

Exploring the mechanisms

In practice, at our hospital, we prefer to place a balloon catheter at night to avoid expulsion due to daily activity. Thus far, no studies have focused on this issue as a potential mechanism for labour induction. Theoretically, the insertion of a foreign object could increase the risk of intrauterine infections; however, the limited data from our analysis and previous studies did not show any evidence that the cervical ripening balloon catheter contributes to increased infection occurrences [6, 18, 20, 22, 28,29,30,31]. More studies are required to address the effects of balloon catheters on the rupturing of membranes and infection. In addition, physiologic differences according to parity in the mechanism through which balloon catheters induce labour must also be assessed.

Prior research demonstrated that a Bishop score > 5 was associated with a greater likelihood of vaginal delivery [32, 33]. Although a higher Bishop score was achieved in the double-balloon group in our analysis, there were no differences in the vaginal delivery rates between the two groups. This result interested us, and we hypothesise that there may be a threshold for the Bishop score beyond which no further effect is generated; after this threshold is met, the level of hormone secretion takes precedence over cervical conditions. Similar to what is observed in our practice, favourable outcomes are rarely observed with balloon usage alone, unless augmentations (e.g., prostaglandin or oxytocin) are utilised.

The larger volume, the application of pressure on two sides (harder expulsion), and the ability to abandon traction when using the double-balloon catheter may explain the observed outcomes. The larger volume balloon may increase the separation between the amniotic membranes and the uterine decidua, resulting in an increase in the local secretion of prostaglandins and enhancing the cervical ripening process. Though 60 ml and 80 ml Foley catheters are more effective than 30 ml catheters [34,35,36], 80 ml + 80 ml Atad or Cook balloons do not demonstrate superiority to smaller Foley catheters, which may be due to other factors (e.g., traction). We hypothesise that traction may have a greater effect on the induction of labour and that the one-sided application of pressure may interfere with the labour pattern less than two-sided pressure. In theory, traction may cause discomfort for patients. However, this hypothesis has not been confirmed by our analysis. Instead, speculum application prior to catheter insertion, which followed the same procedure in both groups, appears to be the main source of discomfort [15].

Further studies are required to investigate the possible biological mechanisms of cervical ripening and the sources of discomfort, in order to inform practice guidelines and device improvement.

Identifying the optimal methods for various populations

Although there were no restrictions on settings, demographics or obstetric characteristics, the participants from all of the included studies, except for Rab [19], were women with viable singletons and without scarred uteri, which limits the applicability of our evidence. Vaginal birth after caesarean delivery (VBAC) has received increasing attention [37], but identifying the optimal method for labour induction in this specific population remains controversial. Pharmacological methods are often rejected in VBAC women because of greater risks of complications. However, whether balloon catheters can and should be utilised in women with scarred uteri, which manufacturers do not recommend, requires further study. In addition, twins and other multiple pregnancies are contraindications for the use of balloon catheters, despite the increased frequency of multiple pregnancies. Whether balloon catheters can be used in situations with multiple pregnancies also deserves further study.

Strengths and limitations

In the current meta-analysis, no demographic or obstetric characteristics were restricted, which increases the applicability of the evidence. We performed evaluations examining evidence of bias and applied quality grades strictly based on the original reports and the Cochrane handbook. The 7 included trials are all rigorous in design, enabling the appraisal and interpretation of their results. Additionally, because bias is more important for studies with subjective events and positive results than for studies with negative results and objective outcomes, such as our analysis, it was acceptable to assume that bias would not practically undermine the results of our analysis.

When extracting data, some outcomes reported in various forms required data conversions, which may have introduced analytical bias. Although we conducted sensitivity analysis specifically to test this possibility, it cannot be clearly determined whether these conversions influenced our outcomes. In addition, the outcomes we chose for this analysis are widely used in practice, which helps to avoid potential inconsistencies, and appropriate subgroup analyses were performed to identify potential sources of heterogeneity; however, heterogeneity remained too complex to analyse fully.

The sample size of the current analysis had adequate power for the evaluation of the primary outcome. For some secondary outcomes, fewer data points were available, which may result in insufficient power and higher risks of publication bias. To minimise this bias and to include more relevant studies, we did our best to search databases using a wide range of publication years, to consider potentially eligible reviews and to fully utilise trial registration databases, with sensitivity-maximising search filters. Unfortunately, we were unable to access conference abstracts, proceedings or grey literature. Thus, publication bias cannot be excluded completely, and the results should be interpreted with caution.

The procedures performed during our analysis to reduce bias and assess risks can provide direction for further research, although not all of these are necessary.

Conclusions

Both kinds of balloon catheter perform similarly with regard to efficacy, efficiency, safety and patient satisfaction. The single-balloon device appears to be more economical and practical, particularly in low-resource settings.