Background

In split-mouth randomized controlled trials (RCTs) in oral health, experimental and control interventions are randomly allocated to different areas in the oral cavity (teeth, surfaces, arches, quadrants) [1–3]. Compared with parallel-arm RCTs, split-mouth RCTs have the advantage that, because each subject serves as his or her own control, most of the between-patient variability in outcome is removed from the intervention effect estimate, which may increase statistical power [4, 5]. Because every subject receives each intervention, the design may also be better suited to determining patient preferences.

Many researchers in oral health use the split-mouth design. Therefore, systematic review authors frequently include trials of both split-mouth and parallel-group designs to derive combined intervention effects. The rationale for including split-mouth RCTs is to use all the available evidence. However, the split-mouth design may lead to biased intervention effect estimates. For instance, carry-over effects (ie, contamination or “spilling” of the effects of one intervention from one site to another site) may induce bias in split-mouth RCTs [4]. If the interventions are delivered at different times, period effects may also influence intervention effects. Moreover, the statistical analysis of split-mouth RCTs differs from that of parallel-arm RCTs because the paired nature of data must be taken into account [6–8]. Failure to consider the difference between the two types of trials may result in unreliable inference because the confidence interval for the true combined effect will be incorrect. Lesaffre et al. suggested that intervention effect estimates from split-mouth and parallel-arm RCTs may not be the same and recommended separate subgroup meta-analyses of split-mouth and parallel-arm RCTs to investigate systematic differences [9].

In this meta-epidemiological study, we aimed to assess whether data from split-mouth RCTs were incorporated appropriately in meta-analyses and whether intervention effect estimates differ between split-mouth and parallel-arm RCTs in meta-analyses.

Methods

We performed a meta-epidemiological study to compare intervention effect estimates between split-mouth RCTs and parallel-arm RCTs. We identified meta-analyses that included at least one split-mouth RCT and at least one parallel-arm RCT assessing a variety of conditions and interventions on binary or continuous outcomes. For each selected meta-analysis, we compared intervention effect estimates between split-mouth and parallel-arm RCTs. In a second stage, results were summarized across all meta-analyses.

Selection of meta-analyses, trials and outcomes

To identify eligible studies, we first searched MEDLINE and EMBASE. Search equations for each database included the free-text term “split mouth” combined with a filter designed to identify systematic reviews (see Additional file 1) [10]. Second, we performed a full-text search of the Cochrane Database of Systematic Reviews (CDSR) through http://www.thecochranelibrary.com and archie.cochrane.org. We also searched the Database of Abstracts of Reviews of Effects (DARE). Third, we searched SCIRUS, a science-specific search engine covering full-text articles. The last search was conducted in February 2013, with no restriction on date or language.

Two authors independently and in duplicate screened the titles and abstracts of records retrieved by the search, and then screened the selected full-text reports. When the design of the included trials was unclear from an abstract, we always screened the full-text article. Any disagreements were resolved by discussion.

Eligible studies were systematic reviews of therapeutic or preventive interventions that included at least one split-mouth RCT, as labeled by the review authors, and at least one parallel-arm RCT in quantitative syntheses (ie, meta-analyses). Updates of systematic reviews were selected rather than initial versions.

From each eligible systematic review, we selected all independent meta-analyses (defined as the comparison between specific experimental and control interventions on a given outcome). We excluded meta-analyses in which all RCTs had the same design (all split-mouth or all parallel-arm). Then we selected one binary or one continuous outcome, or both if present, corresponding to the previous criteria. In cases of multiple eligible outcomes, we chose the primary outcome as stated by the authors or, failing that, the outcome with the largest number of studies. For each meta-analysis, we selected all individual RCTs and we excluded non-randomized studies. Finally, we identified overlapping meta-analyses (ie, with common RCTs) and excluded the meta-analysis that included fewer trials [11].

Data extraction

Two authors extracted the data in duplicate and independently, with discrepancies resolved by discussion. For each systematic review, we recorded the first author, publication year and studied population. For each meta-analysis, we recorded the experimental intervention, the comparator, the outcome and the number of split-mouth and parallel-arm RCTs.

We assessed the methods used by the authors for incorporating split-mouth RCTs into meta-analyses. First, we assessed the presence of subgroup analyses (ie, split-mouth RCTs and parallel-arm RCTs analyzed separately) and/or whether one quantitative synthesis combined split-mouth and parallel-arm RCTs; in the latter case, we assessed whether the techniques described by Elbourne 2002 or Lesaffre 2009 were used [9, 12]. Second, we assessed whether the authors calculated the standard error of the intervention effect estimate in split-mouth RCTs using appropriate methods (ie, statistical approaches taking into account the paired nature of data; eg, the techniques described by Follmann) [13].

From the systematic review, for each RCT, we abstracted the first author and publication year and the design (split-mouth or parallel-arm). From the original RCT reports, we extracted the number of patients and, according to type of outcome, the means and associated SDs, or number of events, for both the experimental and control arms.

Statistical analysis

Binary and continuous outcomes were analyzed separately. For each RCT, we derived an intervention effect estimate and the associated sampling variance. Intervention effects were measured by odds ratios (ORs) for binary outcomes and standardized mean differences (SMDs, or Cohen’s d) for continuous outcomes. All comparisons were coded so that the experimental intervention was compared with the comparator for an unfavorable outcome; thus, an OR < 1 (binary outcomes) or an SMD < 0 (continuous outcomes) indicated a beneficial effect of the experimental intervention. For binary outcomes, in cases of 0-count cells, we used a 0.5 continuity correction. If no events occurred in any arm, the RCT was excluded.
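The per-trial calculations for parallel-arm RCTs can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `log_odds_ratio` and `cohens_d` are hypothetical, the variance formulas are the usual large-sample approximations, and the sketch assumes that the 0.5 correction is added to all four cells whenever any cell is zero (the text does not specify this detail).

```python
import math

def log_odds_ratio(events_exp, n_exp, events_ctl, n_ctl, correction=0.5):
    """Return (log OR, variance) for a parallel-arm trial.

    Returns None when no events occurred in either arm, mirroring the
    exclusion rule in the text. Assumption: the continuity correction is
    added to all four cells as soon as one cell is zero.
    """
    if events_exp == 0 and events_ctl == 0:
        return None
    a, b = events_exp, n_exp - events_exp   # experimental: events, non-events
    c, d = events_ctl, n_ctl - events_ctl   # control: events, non-events
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def cohens_d(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl):
    """Return (SMD, variance) using the pooled SD (Cohen's d)."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (n_exp + n_ctl - 2)
    d = (mean_exp - mean_ctl) / math.sqrt(pooled_var)
    # Large-sample approximation to the sampling variance of d.
    variance = (n_exp + n_ctl) / (n_exp * n_ctl) + d ** 2 / (2 * (n_exp + n_ctl))
    return d, variance

# Illustrative (made-up) data: 10/50 events vs 20/50 events.
print(log_odds_ratio(10, 50, 20, 50))
print(cohens_d(10.0, 2.0, 30, 12.0, 2.0, 30))
```

With the coding used in the text, a negative log OR or a negative SMD from these functions corresponds to a beneficial effect of the experimental intervention.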

For split-mouth RCTs, we took into account the matched nature of data: marginal ORs were calculated by the method of Becker and Balagtas [12, 14, 15], and SMDs were estimated by taking into account the within-patient correlation coefficient [16]. We contacted the corresponding authors of all split-mouth RCTs to ask for the required matched outcome data. If these data were not available from the reports and the authors did not respond, we assumed a within-patient correlation of 0.5. Sensitivity analyses with correlation values of 0 and 0.25 yielded similar results.
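To make the role of the pairing concrete, the sketch below shows one way the variance of a split-mouth estimate can be adjusted. It is an illustration under stated assumptions rather than the authors' implementation: `paired_log_odds_ratio` applies a delta-method variance to the marginal OR (in the spirit of the marginal approach cited above, and assuming no zero cells), and `paired_smd` uses a commonly quoted approximation for paired designs, of the form given for cross-over trials in the Cochrane Handbook. Both function names are hypothetical, and the exact formulas used in the study are not spelled out in the text.

```python
import math

def paired_log_odds_ratio(n, events_exp, events_ctl, events_both):
    """Marginal log OR and variance for n patients who each receive both
    interventions. events_both is the number of patients with the event at
    both the experimental and the control site. Assumes no zero cells.
    """
    a, b = events_exp, n - events_exp
    c, d = events_ctl, n - events_ctl
    log_or = math.log((a * d) / (b * c))
    # Covariance between the two marginal logits (delta method); it is zero
    # when events at the two sites are independent within patients.
    covariance = n * (n * events_both - a * c) / (a * b * c * d)
    variance = 1 / a + 1 / b + 1 / c + 1 / d - 2 * covariance
    return log_or, variance

def paired_smd(mean_exp, sd_exp, mean_ctl, sd_ctl, n, correlation=0.5):
    """SMD and variance for a paired design with n patients.

    The default correlation of 0.5 matches the value assumed in the text
    when matched data were unavailable.
    """
    sd_pooled = math.sqrt((sd_exp ** 2 + sd_ctl ** 2) / 2)
    smd = (mean_exp - mean_ctl) / sd_pooled
    se = math.sqrt(1 / n + smd ** 2 / (2 * n)) * math.sqrt(2 * (1 - correlation))
    return smd, se ** 2

# Illustrative (made-up) data: 40 patients, 10 and 20 events, 8 with both.
print(paired_log_odds_ratio(40, 10, 20, 8))
# Sensitivity to the assumed within-patient correlation.
for r in (0.0, 0.25, 0.5):
    print(r, paired_smd(10.0, 2.0, 12.0, 2.0, 20, correlation=r))
```

In both functions a positive within-patient association shrinks the variance relative to an analysis that treats the two sites as independent groups, which is why ignoring the pairing typically gives confidence intervals that are too wide.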

Within each meta-analysis, when there was more than one split-mouth or more than one parallel-arm RCT, we calculated a combined intervention effect and its variance for that design. We used both fixed-effects and random-effects (restricted maximum likelihood estimator) calculations. Results were similar, so we report the fixed-effects results as the primary analysis and the random-effects results as a sensitivity analysis.
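The fixed-effects step described above amounts to inverse-variance pooling within each design. A minimal sketch follows; the function name `pool_fixed` and the input values are hypothetical, and the random-effects (restricted maximum likelihood) variant is not shown.

```python
def pool_fixed(estimates, variances):
    """Inverse-variance fixed-effects pooling.

    estimates: per-trial effect estimates (log ORs or SMDs) for one design.
    variances: the matching sampling variances.
    Returns (pooled estimate, variance of the pooled estimate).
    """
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    return pooled, 1 / total_weight

# Illustrative (made-up) log ORs from two split-mouth trials in one meta-analysis.
print(pool_fixed([-0.5, -0.1], [0.04, 0.16]))
```

Applying the function once to the split-mouth trials and once to the parallel-arm trials of a meta-analysis gives the two summary estimates that are compared in the next step.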

We used a meta-epidemiological analysis to estimate the combined difference in intervention effect estimates between split-mouth and parallel-arm RCTs by a two-step method [17]. For each meta-analysis with a binary outcome, we estimated the ratio of the intervention effect for split-mouth RCTs to that for parallel-arm RCTs, the ratio of ORs (ROR): on a logarithmic scale, we derived log(ROR) = log(summary OR in split-mouth RCTs) - log(summary OR in parallel-arm RCTs) and its variance Var[log(ROR)] = Var[log(summary OR in split-mouth RCTs)] + Var[log(summary OR in parallel-arm RCTs)]. Then we estimated a combined ROR and 95% confidence interval (CI) across meta-analyses by a random-effects meta-analysis model (restricted maximum likelihood estimator). For each meta-analysis with a continuous outcome, we estimated the difference in intervention effect estimates between split-mouth and parallel-arm RCTs, the difference in SMDs (∆SMD): we derived ∆SMD = (summary SMD in split-mouth RCTs) - (summary SMD in parallel-arm RCTs) and its variance Var(∆SMD) = Var[summary SMD in split-mouth RCTs] + Var[summary SMD in parallel-arm RCTs]. Then we estimated a combined ∆SMD and 95% CI across meta-analyses by a random-effects meta-analysis model (restricted maximum likelihood estimator). An ROR < 1 or ∆SMD < 0 indicated that split-mouth RCTs yielded larger intervention effect estimates than parallel-arm RCTs. Heterogeneity in RORs or ∆SMDs across meta-analyses was assessed by the I² statistic and by τ², the between-meta-analyses variance. We plotted the results on forest plots. We reported the 95% prediction intervals for the ROR and ∆SMD, which provide a predicted range for the true difference in treatment effects between split-mouth and parallel-arm RCTs in an individual meta-analysis.
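The two-step method can be sketched as follows. Note the stated substitutions: the study used a restricted maximum likelihood estimator for the between-meta-analyses variance, whereas this sketch uses the closed-form DerSimonian-Laird estimator to stay short; the prediction interval defaults to a normal quantile, and a Student t quantile can be passed through the `crit` argument instead. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def design_difference(est_split, var_split, est_parallel, var_parallel):
    """Step 1: split-mouth minus parallel-arm summary estimate (log ROR or
    delta SMD) and its variance, treating the two summaries as independent."""
    return est_split - est_parallel, var_split + var_parallel

def random_effects_dl(estimates, variances):
    """Step 2: combine the differences across meta-analyses.

    DerSimonian-Laird estimator (a stand-in for REML, see lead-in).
    Returns (combined estimate, its variance, tau2, I2 in percent).
    """
    k = len(estimates)
    w = [1 / v for v in variances]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return mu, 1 / sum(w_re), tau2, i2

def prediction_interval(mu, var_mu, tau2, crit=None):
    """95% prediction interval for the true difference in a new meta-analysis."""
    if crit is None:
        crit = NormalDist().inv_cdf(0.975)  # assumption: normal quantile
    half_width = crit * math.sqrt(tau2 + var_mu)
    return mu - half_width, mu + half_width

# Illustrative (made-up) summary log ORs for one meta-analysis.
log_ror, var_log_ror = design_difference(math.log(0.5), 0.04, math.log(0.8), 0.09)
print(math.exp(log_ror), var_log_ror)
# Illustrative combination across two meta-analyses.
mu, var_mu, tau2, i2 = random_effects_dl([0.0, 1.0], [0.1, 0.1])
print(mu, var_mu, tau2, i2, prediction_interval(mu, var_mu, tau2))
```

As a rough check of the prediction-interval formula, plugging in the rounded values reported in the Results for continuous outcomes (combined ∆SMD 0.08, 95% CI -0.14 to 0.30, between-meta-analyses variance 0.12) with the normal quantile gives approximately -0.63 to 0.79. The text does not state which quantile the authors used.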

Analyses were performed with R software (online at http://www.R-project.org, the R Foundation for Statistical Computing, Vienna, Austria). A 2-tailed P < 0.05 was considered statistically significant.

Results

Eligible systematic reviews and meta-analyses

The search yielded 335 potentially eligible articles. The flow chart of selection and reasons for exclusion are in Figure 1. We included 18 systematic reviews [18–36]; 8 were Cochrane systematic reviews [23, 25–27, 30, 33, 34, 36]. The selected reviews were all published recently (range 2006 to 2013). The reviews concerned interventions for periodontal disease (n = 9), dental surgery/implantology (n = 6), dental caries (n = 2), and orthodontic treatment (n = 1) (Table 1).

Figure 1. Flow diagram.

Table 1 18 selected systematic reviews

From the 18 systematic reviews, 42 meta-analyses were eligible. The identification of overlapping meta-analyses led to the exclusion of 8 meta-analyses (from 2 systematic reviews). Consequently, 34 meta-analyses contributed to our analysis: 15 with binary outcome data, and 19 with continuous outcome data. The median number of RCTs per meta-analysis was 4 (range 2–16) (Table 2).

Table 2 34 selected meta-analyses (15 with binary outcomes and 19 with continuous outcomes)

Methods used for incorporating split-mouth trials into meta-analyses

In all systematic reviews, the authors combined split-mouth trials with parallel-arm trials in meta-analyses (Table 3). For 6 of 18 systematic reviews, the authors also meta-analyzed split-mouth and parallel-arm trials separately in subgroup analyses. Regarding the standard error of the intervention effect estimate in split-mouth RCTs, in 8 reviews it was not clear how the paired nature of data was taken into account, and in another 8 reviews the paired nature of data was accounted for by imputation with the methods described by Follmann [13] when the appropriate data were not present in RCT reports. Finally, we contacted the authors of all split-mouth RCTs to ask for the matched outcome data and we received 16 responses.

Table 3 Methods used by review authors to incorporate split-mouth RCTs into meta-analyses

Characteristics of split-mouth and parallel-arm trials

The 15 meta-analyses with binary outcome data involved 28 split-mouth and 28 parallel-arm RCTs; the 19 meta-analyses with continuous outcome data involved 45 split-mouth and 48 parallel-arm RCTs, for 56 and 65 distinct split-mouth and parallel-arm RCTs, respectively. Parallel-arm RCTs were published later than split-mouth RCTs (median [25th–75th percentiles] 2007 [2002–2008] versus 2004 [1999–2008]); the first published RCT was a split-mouth RCT in 20 of the 34 meta-analyses. Parallel-arm RCTs had a larger median sample size than split-mouth RCTs (median 40 [29–90] versus 20 [12–30]). The median total relative weight of split-mouth RCTs in each meta-analysis was 51% [39–71%] for the 34 meta-analyses.

Differences in intervention effect between split-mouth and parallel-arm trials

Among the 15 meta-analyses with binary outcome data, 8 yielded a larger intervention effect for split-mouth RCTs (none with evidence for a difference between the two estimates) and 6 a larger intervention effect for parallel-arm RCTs (2 with evidence for a difference between the two estimates) (see Additional file 2). Split-mouth and parallel-arm RCTs did not differ in intervention effect estimates: the meta-epidemiological analysis yielded a combined ROR of 0.96 (95% CI 0.52–1.80, p = 0.91; I² = 50%, 95% CI 9%–80%; τ² = 0.62 across meta-analyses) (Figure 2). The associated 95% prediction interval for the ROR was 0.19 to 5.08. Finally, when random-effects models were used for within-meta-analysis calculations of summary effect sizes in split-mouth and parallel-arm RCTs, the analysis yielded a combined ROR of 0.79 (95% CI 0.47–1.32, p = 0.36).

Figure 2. Difference in intervention effect estimates between split-mouth and parallel-arm randomized controlled trials for binary outcome data.

Among the 19 meta-analyses with continuous outcome data, 8 yielded a larger intervention effect for split-mouth RCTs (2 with evidence for a difference between the two estimates) and 9 a larger intervention effect for parallel-arm RCTs (4 with evidence for a difference between the two estimates) (see Additional file 2). Split-mouth and parallel-arm RCTs did not differ in intervention effect estimates: the meta-epidemiological analysis yielded a combined ∆SMD of 0.08 (95% CI -0.14 to 0.30, p = 0.46; I² = 56%, 95% CI 21%–82%; τ² = 0.12 across meta-analyses) (Figure 3). The associated 95% prediction interval for the ∆SMD was -0.63 to 0.79. Finally, when random-effects models were used for within-meta-analysis calculations of summary effect sizes in split-mouth and parallel-arm RCTs, the analysis yielded a combined ∆SMD of 0.05 (95% CI -0.21 to 0.30, p = 0.73).

Figure 3. Difference in intervention effect estimates between split-mouth and parallel-arm randomized controlled trials for continuous outcome data.

In all, 6 of 8 meta-analyses showing differences between split-mouth and parallel-arm RCTs beyond chance did not meta-analyze split-mouth and parallel-arm RCTs separately in subgroup analyses.

Discussion

In our meta-epidemiological study, we found that split-mouth trials contributed half of the evidence in meta-analyses. Contrary to the recommendations by Lesaffre et al. and the Cochrane Oral Health group, most systematic reviews did not meta-analyze split-mouth and parallel-arm trials separately in subgroup analyses [37]. Moreover, most reviews did not report explicitly how split-mouth RCTs were handled in meta-analyses, while others approximated a paired analysis by imputing within-patient correlations. Finally, our meta-epidemiological study did not provide sufficient evidence for a systematic difference in intervention effect estimates between split-mouth and parallel-arm RCTs, both for continuous and binary outcome data.

The main difference between split-mouth and parallel-arm trials with regard to mechanisms of bias is that, in split-mouth trials, interventions may have effects on parts of the dentition other than those to which they were assigned; these carry-over effects put split-mouth trials at risk of bias. However, no method exists to assess or test the extent of carry-over effects in a split-mouth trial. As a consequence, the possibility of carry-over effects should be considered before deciding whether a split-mouth design should be used. As far as we can judge a posteriori, the effects of interventions assessed in the reviews selected for our meta-epidemiological study were always localized.

Previous meta-epidemiological studies showed that individual study processes (eg, inadequate allocation concealment, non-blinding [38]) or nonprocess-related factors (eg, whether a study was conducted at a single center [39]) may put a randomized trial at risk of bias [40]. Very few meta-epidemiological studies have assessed whether study design itself could be associated with treatment effect estimates. Lathyris et al. focused on the cross-over design, which is relevant to oral health research and biomedical research in general; the results of cross-over trials tended to agree with those of parallel-arm trials [41]. Here, we focused on the split-mouth design, which is popular in oral health research. This type of design is also relevant to other fields of biomedical research, in which split-body studies allocate the interventions to separate parts of the body of each participant. However, these trials are infrequent (about 2-3% of randomized trials indexed in PubMed) [42, 43] and we could find only one meta-analysis including at least one split-body trial and at least one parallel-arm trial [44].

Our findings are based on recently published systematic reviews covering a fair range of conditions and interventions in oral healthcare. Consequently, our results are more generalizable than could be obtained with a focus on a particular topic. Our study has several limitations. We selected a relatively small number of systematic reviews for our meta-epidemiological study. It is difficult to identify reviews with both parallel-arm and split-mouth trials with usual search strategies, and we acknowledge that unidentified reviews may exist. However, we systematically searched for both Cochrane and non-Cochrane reviews, including a search of full-text articles indexed in the Cochrane Library and in the Scirus database. Unfortunately, the latter service is no longer running. We acknowledge that searching additional regional databases (e.g., LILACS, PASCAL) and full-text databases (e.g., HighWire Press, Google Scholar) might have identified further eligible systematic reviews. Eligible reviews may be missing because of reporting bias (including location bias and language bias). However, reporting bias is usually driven by the magnitude, direction and statistical significance of treatment effects. We see no reason for reviews to be missing because of the difference in treatment effect estimates between split-mouth and parallel-arm RCTs. As a consequence, the direction of any impact of missing reviews on our meta-epidemiological study is unpredictable, and its magnitude is probably limited. In addition to the relatively small number of selected reviews, the number of split-mouth and parallel-arm RCTs in each meta-analysis was small. Meta-analyses typically include a limited number of trials: the median number of trials in a large sample of Cochrane meta-analyses was 3 [45]. The consequence is uncertainty in the estimated difference between the 2 study designs.
Because of these limitations, and because this is to our knowledge the first meta-epidemiological investigation on the subject, we acknowledge that these results should be replicated by including additional comparisons between the two designs as they become available. A second caveat is that we did not assess risk of bias within each RCT and we cannot assess meta-confounding. The split-mouth and parallel-arm trials in the selected reviews may differ in their methodological quality. However, it would be difficult to assess the risk of bias in the selected split-mouth trials because assessing internal validity requires adequate reporting, and split-mouth trials are frequently poorly reported [37]. Moreover, meta-confounding because of systematic differences in risk of bias between split-mouth and parallel-arm trials would be an alternative explanation for an association between trial design and treatment effect estimates, but we did not find evidence of such an association.

Our results support the use of all available evidence in systematic reviews, including that from split-mouth and parallel-arm RCTs, and authors should consider results from both trial designs in syntheses of oral health primary research. The incorporation of split-mouth RCTs should follow adequate methods [9, 12]; moreover, for each split-mouth RCT, the difference between groups rather than estimates per group must be used and the standard error of the intervention effect estimate should take the matched nature of data into account [13].

Because trials in this field are frequently small, one should not be confident that the true intervention effect lies closer to the effect estimates from parallel-arm or split-mouth trials. Even when combining split-mouth and parallel-arm RCTs in the same meta-analysis, consideration should be given to potential differences between the different types of trials in subgroup analyses, until there is more evidence that the two designs do not systematically differ. Meta-analysts should also consider issues of external validity because split-mouth trials include patients with symmetrical caries or lesions who could differ from other patients in terms of possibly poorer brushing and dietary behavior.

Conclusions

Our meta-epidemiological study did not provide sufficient evidence for a difference in intervention effect estimates derived from split-mouth and parallel-arm RCTs. Systematic review authors should consider including split-mouth RCTs in their meta-analyses, provided that the analysis appropriately accounts for the paired nature of the data.