From the 246 studies identified in our search, 93 duplicates were removed, and the titles and abstracts of the remaining 153 studies were screened for eligibility. Of these 153 citations, 119 were excluded on the basis of title and abstract alone because of an unrelated intervention or comparator (n = 76), incomplete study data (n = 30), or being an animal study (n = 13). The remaining 34 citations underwent full-text review. Of these, 25 were excluded because of an irrelevant comparator (n = 2), not being an RCT (n = 20), or being a protocol without results (n = 3). Consequently, nine RCTs [13,14,15,16,17,18,19,20,21] involving a total of 1806 patients were found to be suitable for this meta-analysis. The search process is shown in Fig. 1. The authors of one of these studies reported potential conflicts of interest related to industry sponsorship.
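The study-flow arithmetic above can be checked with a short script. This is only an illustrative sketch: the variable names are ours, and the counts are copied from the text rather than taken from the original screening records.

```python
# Study-flow counts as reported in the paragraph above (names are illustrative).
identified = 246
duplicates = 93
screened = identified - duplicates  # records screened on title and abstract

excluded_at_screening = {
    "unrelated intervention or comparator": 76,
    "incomplete study data": 30,
    "animal study": 13,
}
full_text = screened - sum(excluded_at_screening.values())

excluded_at_full_text = {
    "irrelevant comparator": 2,
    "not an RCT": 20,
    "protocol without results": 3,
}
included = full_text - sum(excluded_at_full_text.values())

print(screened, full_text, included)  # 153 34 9
```

Each stage total matches the figure stated in the text (153 screened, 34 full-text reviews, nine included RCTs).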
Characteristics of Included Studies
Eight RCTs [13, 15,16,17,18,19,20,21] were multicentric, while one trial was performed at a single center. Two studies [15, 17] belong to the subgroup analysis of the STOP! trial, while the data for two more studies [20, 21] were obtained from the MEDITA trial. Two trials [16, 17] involved adolescents, while the remaining seven trials [13,14,15, 18,19,20,21] involved only adults. All trials [13,14,15,16,17,18,19,20,21] enrolled patients presenting with any trauma severity treated at a hospital ED. The majority of the included patients exhibited moderate-to-severe trauma, but three studies investigated patients with severe pain, moderate pain, and minor-to-moderate pain, respectively (see Table 1 for more details regarding trauma severity in the included trials). The specified methoxyflurane dosage was the same in all included trials, with the most common regimen being a Penthrox® inhaler (containing 3 mL methoxyflurane). However, the control group interventions varied among studies: four trials [15,16,17, 19] were normal saline controlled, while five trials [13, 14, 18, 20, 21] used different SAT modes. Notably, one study added the SAT in both the experimental and normal saline groups. Detailed characteristics of all the included studies are shown in Table 1.
Risk of Bias
The risk of bias assessment for all nine studies can be viewed in the supplemental material (eTable 2, eFig. 1A, B). The overall risk of bias was low in four double-blind RCTs [15,16,17, 19] and high in five open-label trials [13, 14, 18, 20, 21] because participants or investigators were not blinded. All included trials reported prespecified outcomes. Randomization sequences were adequately generated and allocation was adequately concealed in all studies. Three trials [18, 20, 21] had an unclear risk of other bias, which was attributed to insufficient methodological reporting.
Seven trials [13, 15,16,17,18, 20, 21] reported on the change in pain intensity score within 30 min after the start of treatment. Pain intensity change was reported using a 100-point VAS (six studies, n = 1028) [15,16,17,18, 20, 21] or NRS (one study, n = 305), which was converted to a 10-point scale. One of the seven trials was not included in the meta-analysis because we were unable to obtain the original data from graphs. To assess the pooled effects, changes in pain intensity from baseline were analyzed at several time points: 3 min (three studies, n = 668) [13, 18, 21], 5 min (six studies, n = 1264) [13, 15,16,17,18, 21], 10 min (six studies, n = 1264) [13, 15,16,17,18, 21], 15 min (six studies, n = 1264) [13, 15,16,17,18, 21], 20 min (six studies, n = 1264) [13, 15,16,17,18, 21], 25 min (two studies, n = 363) [18, 21], and 30 min (two studies, n = 363) [18, 21].
Participants administered methoxyflurane exhibited larger mean reduction from baseline in pain intensity score than the control group at 3 min (WMD − 0.44 cm; 99% CI − 0.64, − 0.23; p < 0.00001; I2 = 0%), 5 min (WMD − 0.93 cm; 99% CI − 1.14, − 0.71; p < 0.00001; I2 = 28%), 10 min (WMD − 1.11 cm; 99% CI − 1.56, − 0.66; p < 0.00001; I2 = 65%), 15 min (WMD − 1.23 cm; 99% CI − 1.99, − 0.47; p < 0.0001; I2 = 85%), and 20 min (WMD − 1.12 cm; 99% CI − 1.75, − 0.49; p < 0.00001; I2 = 75%). Changes in pain intensity did not differ between groups at 25 min (WMD − 0.36 cm; 99% CI − 0.85, 0.31; p = 0.06; I2 = 3%) and 30 min (WMD − 0.39 cm; 99% CI − 0.97, 0.19; p = 0.08; I2 = 0%) (Figs. 2, 3a, b). These differences failed to meet the threshold for clinical significance (a 1.3- to 1.5-cm change at any single time point).
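The relationship between a reported confidence interval and its p value can be illustrated with a short back-calculation. The sketch below assumes a symmetric normal-theory interval, which is our assumption and not a description of the software used for this meta-analysis; the function name `p_from_ci` is hypothetical. It uses the 30-min comparison reported above (WMD − 0.39 cm; 99% CI − 0.97, 0.19).

```python
from statistics import NormalDist


def p_from_ci(estimate, lower, upper, level=0.99):
    """Two-sided p value implied by a symmetric normal-theory CI (assumed)."""
    std_normal = NormalDist()
    # Critical value for the stated confidence level (about 2.576 for 99%).
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    # Standard error recovered from the width of the interval.
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    return 2 * (1 - std_normal.cdf(abs(z)))


# 30-min comparison as reported in the text.
p_30 = p_from_ci(-0.39, -0.97, 0.19)
print(round(p_30, 2))  # 0.08
```

Under these assumptions the interval reproduces the reported p = 0.08, which is why a 99% CI that crosses zero corresponds to a nonsignificant result at the 0.01 threshold.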
Secondary Pain-Related Outcomes
Time from Start of Treatment to First Pain Relief
Seven studies [13, 15, 16, 18, 20, 21, 25] reported the time to first pain relief; one of these seven trials was not included in the meta-analysis because it provided insufficient information for analysis. Compared to the control group, methoxyflurane significantly shortened the time to first pain relief by an average of 5.29 min (95% CI − 6.97 min to − 3.62 min; p < 0.00001; I2 = 100%) (Fig. 4a).
Proportion of Patients Experiencing Pain Relief
The proportion of pain relief was assessed in six studies [13, 15, 17,18,19,20]. Patients in the methoxyflurane group exhibited higher pain relief rates than the control group (RR 1.41; 95% CI 1.17–1.70; p = 0.0003; I2 = 85%) (Fig. 4b).
Proportion of Patients Administered Rescue Analgesic Medication
Seven studies [13, 15,16,17,18, 20, 21] reported the proportion of patients administered rescue analgesic medication before discharge, and all studies had complete data to allow statistical analysis. Compared to the control group, inhaled methoxyflurane significantly reduced the proportion of patients administered rescue analgesic medication (RR 0.32; 95% CI 0.21–0.49; p < 0.00001; I2 = 38%) (Fig. 4c).
Other Secondary Outcomes
Proportions of Patients, Physicians, or Nurses Who Rated Satisfaction as Excellent, Very Good, or Good
For therapeutic satisfaction, eight studies [13, 15,16,17,18,19,20,21], seven studies [15,16,17,18,19,20,21], and four studies [15,16,17, 19] reported the proportion of patients, physicians, and nurses, respectively, who rated it as excellent, very good, or good. One study was excluded because its data were not available in an extractable format. The overall efficacy of methoxyflurane was rated excellent, very good, or good by significantly more patients in the methoxyflurane group than in the control group (RR 1.31; 95% CI 1.07–1.60; p = 0.009; I2 = 86%). Moreover, significantly more physicians (RR 1.50; 95% CI 1.29–1.74; p < 0.00001; I2 = 58%) and nurses (RR 1.89; 95% CI 1.37–2.62; p = 0.0001; I2 = 80%) rated the practicality of using methoxyflurane as excellent, very good, or good when compared to the control (Fig. 5).
Eight studies [13,14,15,16,17,18,19, 21] evaluated the total incidence of TEAEs, but one of these trials was not included because raw data were not presented in the manuscript. In addition, the most common adverse events following methoxyflurane administration, namely, dizziness, somnolence, feeling drunk, and headache, were reported by seven (1386 patients) [13, 15, 17,18,19,20,21], six (1295 patients) [13, 15, 17,18,19,20], four (809 patients) [17,18,19, 21], and four (921 patients) [15, 17,18,19] of the included trials, respectively.
The risk of total TEAEs after methoxyflurane administration increased three-fold (RR 3.09; 95% CI 1.72–5.57; I2 = 87%), and there was a statistically significant difference when compared to the control (p = 0.0002) (eFig. 2A). Pooled results showed a significantly elevated risk of dizziness (RR 4.12; 95% CI 2.69–6.29; I2 = 0%; p < 0.00001), somnolence (RR 3.60; 95% CI 1.84–7.07; I2 = 0%; p = 0.0002), and feeling drunk (RR 5.43; 95% CI 2.21–13.89; I2 = 0%; p = 0.0004) in patients administered methoxyflurane, when compared to those not administered methoxyflurane. However, we found similar rates of headache (RR 1.26 [95% CI 0.81–1.95]; risk difference, 2% [95% CI − 2 to 5%]; I2 = 0%) between those receiving and those not receiving methoxyflurane (eFig. 2B). Pooled results of TEAEs grouped by system organ (p > 0.05) and remaining occasional symptoms (p > 0.05) are presented in supplementary material eFigs. 3 and 4.
Sensitivity and Subgroup Analysis
Neither sensitivity analysis, one excluding the studies with a high risk of bias and the other excluding the results of the PenASAP trial, changed the estimates or conclusions for any outcome with high heterogeneity. However, the sensitivity analyses suggest that the open-label studies were probably the dominant source of heterogeneity in this meta-analysis (supplementary material eTables 3A and 4A).
Our pre-specified subgroup analysis indicated that heterogeneity may be attributed to the age subgroup and the control group intervention methods. A pre-specified subgroup analysis comparing the change in pain intensity score at 15 and 20 min after the start of treatment between trials controlled with inhaled normal saline and those controlled with SAT found the control group intervention to be an effect modifier. When the data for the normal saline group were excluded, the significant overall effect on pain reduction became nonsignificant at 15 min [13, 18, 21] (WMD − 0.73; 99% CI − 1.78 to 0.32; p = 0.07) and 20 min [13, 18, 21] (WMD − 0.77; 99% CI − 1.60 to 0.07; p = 0.02 > 0.01) (Fig. 3c, d); in other words, there was no significant improvement in pain intensity at 15 min and 20 min for the group administered methoxyflurane when compared to SAT (supplementary material eTable 3B). Subsequent subgroup analysis for secondary pain-related outcomes showed that the superiority of methoxyflurane was further narrowed when compared to SAT (the p values were close to the threshold). The different onset times of the two analgesia methods might explain why the overall analgesic benefit of methoxyflurane is not superior to SAT (WMD − 6.00 min; 99% CI − 6.10 to − 5.90; p < 0.0001) [13, 18, 20, 21]. Interestingly, according to another pre-defined subgroup analysis, adolescents tolerated methoxyflurane better than adults, with a higher satisfaction level (RR 1.40 vs RR 1.29) and a lower TEAE incidence (RR 1.39 vs RR 4.45) (supplementary material eTable 4B).
Quality of Evidence
Table 2 shows the summary of findings for all outcomes including the certainty of evidence. The quality of evidence was downgraded by the high risk of bias, serious inconsistency with high heterogeneity (I2 > 50%), or serious imprecision with wide CI in the results. The quality of evidence was low or very low in most outcomes. The outcome of nurses’ satisfaction assessment was the only one that exhibited a moderate quality of evidence.
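The I2 statistic behind the inconsistency judgment (I2 > 50%) can be sketched as follows. This is a minimal illustration using fixed-effect inverse-variance weights and Cochran's Q; the function name `pooled_and_i2` and the two-study input are hypothetical and are not data from the included trials, and the sketch does not reproduce the model actually fitted in this meta-analysis.

```python
def pooled_and_i2(effects, std_errors):
    """Inverse-variance pooled estimate, Cochran's Q, and I2 (in percent)."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2 = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical two-study example (NOT trial data): effects 0 and 2, both SE = 1.
pooled, q, i2 = pooled_and_i2([0.0, 2.0], [1.0, 1.0])
print(pooled, q, i2)  # 1.0 2.0 50.0
```

In this toy case Q = 2 on one degree of freedom, giving I2 = 50%, which sits exactly at the boundary used here for downgrading on inconsistency.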