Introduction

Oral mucositis (OM) is a painful condition characterised by ulcers [1]. Rapid cell division in the oral tract makes mucosal cells particularly sensitive to damage by irradiation [2]. OM commonly occurs in head and neck cancer patients (HNCPs) who have had radiotherapy (RT). It can affect up to 100% of HNCPs [3] and is therefore a significant problem for this group. Radiation-induced oral mucositis (RIOM) can have a detrimental effect on patients’ functioning and quality of life (QoL): the painful inflammation and ulceration may affect patients’ ability to eat, drink and talk [4]. It may cause nutritional deficiencies that reduce patients’ energy and lead to weight loss [4]. Severe RIOM can affect patients’ health outcomes through missed radiotherapy treatments; in fact, RIOM is the side effect of RT to the oral region most likely to limit the RT dose delivered [5].

The model for OM pathogenesis includes five stages: firstly, direct cell damage to the DNA, followed by tissue damage to the submucosa and basal epithelium, leading to inflammation then ulceration of the tissue (where bacteria then cause even more inflammation) and healing as the final stage [6].

Grade 1 RIOM generally starts after approximately 2 weeks of RT, with grade 3 RIOM generally occurring after approximately 3 weeks. Commonly, RIOM peaks 2 weeks after treatment is completed and is resolved 8 weeks after that [7].

Effective interventions are essential to mitigate RIOM; improve patients’ functioning, QoL, and health outcomes; and limit weight loss.

National guidelines for oral care for patients at risk of OM were determined by two organisations in the United Kingdom (UK): the UK Oral Mucositis in Cancer Group (UKOMCG), updated in June 2019, and the Royal College of Surgeons of England and the British Society for Disability and Oral Health, updated in 2018. However, it is unclear how these organisations selected the studies on which they based their recommendations. Also, some of these studies were not contemporaneous. The search for contemporaneous studies in this review identified fifty-eight interventions for the management of RIOM in the last 5 years. For the majority of the interventions, there were few studies, and those conducted had small sample sizes, making it difficult to establish efficacy.

National Institute for Health and Care Excellence (NICE) guidelines, published in May 2018, recommended low-level laser therapy (LLLT) as an effective intervention for OM. However, the implementation of this intervention in a service may incur high set-up costs for equipment and training of staff. Therefore, this study focussed on examining the efficacy of low-cost interventions, which incur no set-up costs.

The aim of this study was to conduct a systematic review (SR) of contemporaneous studies to examine the efficacy of low-cost interventions to mitigate RIOM.

Methods

Study design

The study was designed to establish the efficacy of interventions to mitigate RIOM in HNCPs undergoing RT through an SR of contemporaneous evidence.

Eligibility criteria

Inclusion criteria

Studies that fitted the following criteria were included:

  • Randomised controlled trials (RCTs), SRs and meta-analyses (MAs)

  • Patients receiving RT, with or without chemotherapy (CT), for head and neck cancers

  • Interventions where there had been four or more studies conducted

  • Studies conducted in the last 5 years (from 2014 to 2019)

  • Studies in the English language

  • Studies of adults

Exclusion criteria

Studies were excluded if they fitted the following criteria:

  • Studies where full text was not available

  • Studies where the interventions had added costs for equipment and training
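
The inclusion and exclusion criteria above can be read as a single screening rule. The following Python sketch is illustrative only: the review screened studies manually, and the `Study` record, its field names and the `is_eligible` function are hypothetical names introduced here, not part of the review.

```python
from dataclasses import dataclass


# Hypothetical record; the field names are assumptions for illustration only.
@dataclass
class Study:
    design: str                        # "RCT", "SR" or "MA"
    year: int                          # year the study was conducted or published
    language: str
    adults_only: bool
    head_neck_rt: bool                 # RT, with or without CT, for head and neck cancer
    studies_on_intervention: int       # studies found for the same intervention
    full_text_available: bool
    needs_equipment_or_training: bool  # intervention adds equipment or training costs


def is_eligible(s: Study) -> bool:
    """Apply the stated inclusion criteria, then the exclusion criteria."""
    included = (
        s.design in {"RCT", "SR", "MA"}
        and s.head_neck_rt
        and s.studies_on_intervention >= 4
        and 2014 <= s.year <= 2019
        and s.language == "English"
        and s.adults_only
    )
    excluded = (not s.full_text_available) or s.needs_equipment_or_training
    return included and not excluded
```

Writing the rule this way makes explicit that a study must satisfy every inclusion criterion and trigger no exclusion criterion.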

Search strategy

The search for literature was conducted using the following databases: AMED, CINAHL, Cochrane Library, EMBASE, Emcare, Google Scholar, MEDLINE via Ovid and PubMed. The reference lists of the identified studies were also examined to find additional eligible studies that were not found through the database search.

Keywords used in the search were “Radiotherapy” or “radiation therapy” and “oral mucositis” or “mucositis”. The following Boolean operators were utilised: AND and OR. Figure 1 shows the search strategy adopted in this SR.

Fig. 1 Flow chart showing the search strategy adopted in this review
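
The keywords and Boolean operators described above could be combined into a single search string along the following lines. This is a minimal sketch: the exact string and any database-specific syntax used in the review are not reported, so the grouping shown here (OR within each keyword pair, AND between the pairs) is an assumption, and `or_group` is a hypothetical helper.

```python
# Keyword pairs as stated in the text; how they were combined is an assumption.
treatment_terms = ["Radiotherapy", "radiation therapy"]
condition_terms = ["oral mucositis", "mucositis"]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


# Synonyms are joined with OR; the two concepts are joined with AND.
query = f"{or_group(treatment_terms)} AND {or_group(condition_terms)}"
print(query)
# ("Radiotherapy" OR "radiation therapy") AND ("oral mucositis" OR "mucositis")
```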

Outcomes

The primary outcome measure was OM grade using any appropriate assessment scale, recorded in any format (for example incidence of severe OM, onset of OM, duration of OM) or OM pain (measured using a visual analogue scale or numerical rating scale). The following OM assessment tools were identified: World Health Organization (WHO) OM assessment tool; Radiation Therapy Oncology Group (RTOG) OM grading system; Oral Mucositis Assessment Scale (OMAS) and the National Cancer Institute Common Terminology Criteria for Adverse Events (NCI CTCAE). All secondary outcome measures were included.

Assessment of methodological quality and quality of evidence

The studies’ methodologies were appraised using the Critical Appraisal Skills Programme (CASP) checklists for RCTs and SRs, and the appraisals were recorded on Excel sheets. The quality of the evidence for all studies was assessed using Harbour and Miller’s (2001) Hierarchy of Evidence. The assessments were conducted by the author.

Data collection

Data from the studies was collected and recorded on standardised Excel forms by the author. The data extracted included author, year, title, aim of the study, study design, sample size, inclusion and exclusion criteria, randomisation method, intervention, control, details of cancer treatment, primary and secondary outcome measures, results and conclusion.

Risk of bias across studies

The author considered the risk of bias across the studies.

Translation of results into clinical guidelines

The findings of the review were applied to the GRADE Evidence to Decision (EtD) framework [8] to inform clinical guidance for the mitigation of RIOM.

Results

Study selection

Initially, the search identified 1508 studies. Of these, 1486 were excluded because they did not meet the set criteria or were duplicates. A search of reference lists identified two more studies. In total, twenty-four studies met the inclusion criteria.
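
The study-selection numbers can be checked with simple arithmetic. The sketch below only restates the figures reported above; the variable names are introduced here for illustration.

```python
identified = 1508          # studies identified by the database search
excluded = 1486            # did not meet the criteria, or were duplicates
from_reference_lists = 2   # additional studies found in reference lists

included = identified - excluded + from_reference_lists
print(included)  # 24
```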

Study characteristics

Summaries of the studies’ characteristics and results are in Tables 1 and 2.

Table 1 Summary of the RCTs’ characteristics and results
Table 2 Summary of the SR and MA characteristics and results

The interventions identified, where there were at least four studies, were benzydamine hydrochloride mouth rinse (BHM; RCT, n = 4), honey (SR or MA, n = 6; RCT, n = 6) and oral glutamine (OG; SR or MA, n = 2; RCT, n = 6). The total sample size for each intervention was as follows: BHM (n = 311), honey (n = more than 3985) and OG (n = 924).

Five out of the 16 RCTs used a placebo as control. Other controls used were standard care (n = 4), saline (n = 3), povidone iodine rinse (n = 2), sodium bicarbonate (n = 1) and water (n = 1). The most commonly used OM assessment tool was the RTOG OM assessment tool (n = 10) followed by WHO OM assessment tool (n = 9) and NCI CTCAE (n = 9) then OMAS (n = 4). One study utilised a non-validated OM assessment tool; two RCTs utilised more than one validated OM assessment tool; and one study did not describe how OM was assessed.
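
As a consistency check, the per-intervention counts and the control counts reported above can be tallied as follows. This is an illustrative sketch only; the dictionary layout is an assumption, and the zero for BHM reviews is inferred from the text listing only RCTs for BHM.

```python
# Study counts per intervention, as reported in the text.
# "SR_or_MA": 0 for BHM is inferred: the text lists only RCTs for BHM.
studies = {
    "BHM": {"RCT": 4, "SR_or_MA": 0},
    "honey": {"RCT": 6, "SR_or_MA": 6},
    "OG": {"RCT": 6, "SR_or_MA": 2},
}

# Controls used in the RCTs, as reported in the text.
controls = {
    "placebo": 5,
    "standard care": 4,
    "saline": 3,
    "povidone iodine rinse": 2,
    "sodium bicarbonate": 1,
    "water": 1,
}

total_rcts = sum(v["RCT"] for v in studies.values())
total_studies = sum(v["RCT"] + v["SR_or_MA"] for v in studies.values())
total_controls = sum(controls.values())

print(total_rcts, total_studies, total_controls)  # 16 24 16
```

The totals agree with the 16 RCTs and twenty-four included studies reported in the text.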

OM was presented in twelve ways: incidence of severe OM (n = 18), onset of OM (n = 9), mean OM grade (n = 4), mean maximum OM grade (n = 4), duration of OM (n = 3), incidence of OM (n = 3), reduction of OM (n = 2), median OM (n = 1), OM recovery time (n = 1), number of OM lesions (n = 1), functional OM (n = 1), mucositis grade at 1 week (n = 1). Twenty-nine secondary outcomes were measured. The most common were pain (n = 10), weight loss (n = 8), treatment interruptions (n = 8), number of patients requiring feeding tubes (n = 5), number of patients requiring analgesia use (n = 4) and quality of life (n = 4). The least commonly utilised secondary outcomes measured by only one of the identified studies were number of patients requiring IV fluids, artificial saliva or anti-infection interventions; number of patients who developed dysphagia, nausea, cough or oedema; duration of opioid use; vital signs; blood counts; electrolytes; and renal function.

Synthesis of results

Benzydamine hydrochloride mouth rinse

Primary outcome measure (oral mucositis measures)

Two out of the four studies [9, 10] measured incidence of severe (grades 3 and 4) OM. Both found a statistically significant reduction in severe OM in the BHM group. One study [11] measured the mean OM grade and found a statistically significant reduction in the BHM group in weeks 4 to 7 of RT. One study [12] measured the median OM and found a statistically significant reduction in the BHM group. The same study measured the mean maximum OM grade and found a lower OM grade in the BHM group; the statistical significance was not calculated.

Secondary outcome measures

Three studies measured treatment interruptions; one study [10] found statistically significantly fewer treatment interruptions in the BHM group receiving RT alone but not in the group receiving chemoradiotherapy (CRT). One study [11] found fewer treatment interruptions but did not calculate statistical significance, and one study [12] found no statistically significant difference between the groups. Two studies measured the number of participants who required feeding tubes fitted. One study [10] found statistically significantly fewer participants in the BHM group, receiving RT alone, required feeding tubes fitted. They found no statistically significant difference between the groups receiving CRT. The second study [12] did not find a statistically significant difference between the groups.

Only one study [12] recorded adverse events (AEs) and found that 6.75% of participants in the BHM group were unable to tolerate the full strength of BHM due to a burning sensation in the mouth.

Quality of studies

A summary of the critical appraisal of the RCTs is in Table 3 and SRs and MAs in Table 4.

Table 3 Critical appraisal of RCTs using the Critical Appraisal Skills Programme
Table 4 Critical appraisal of SRs and MAs using the Critical Appraisal Skills Programme

The overall methodological quality of three out of the four studies examining the use of BHM to mitigate RIOM was low. Only one study [12] had moderate methodological quality.

Honey

Primary outcome measure (oral mucositis measures)

Nine out of the twelve studies measured the incidence of severe OM. Seven of these studies [13,14,15,16,17,18,19] found statistically significantly fewer patients in the honey group had severe OM; the other two studies [20, 21] found no statistically significant difference between the groups. Four studies [16,17,18, 21] measured onset of OM. All found onset of OM was delayed in the honey group, although only the first three calculated statistical significance. Two studies measured the mean OM grade; one study [13] found a statistically significantly lower mean grade of OM during the second 3 weeks of RT; the other study [17] found a lower mean OM score but did not calculate the statistical significance. Two studies measured the difference in OM grade between the intervention and control groups. One study [22], an SR, reviewed 17 studies and found a lower OM grade in the honey group in 12 out of the 17 studies; the other study [19] found no statistically significant difference between the groups. One study [23] measured incidence of OM and found a statistically significantly lower incidence of OM over the course of treatment. One study [19] measured the number of OM lesions and found statistically significantly fewer OM lesions in the honey group.

Secondary outcome measures

Six studies measured pain. Three of these studies [17, 19, 24] found statistically significantly lower pain scores in the honey group; one SR [22] reported that four out of the five studies it reviewed found lower pain scores in the honey groups; two studies [13, 20] found no statistically significant difference in pain scores between the groups.

Six studies measured weight loss. Four studies [13, 16, 17, 21] found statistically significantly less weight loss in the honey groups; one SR [22] found less weight loss in the honey groups in the studies it reviewed, and one study [20] found no statistically significant difference between the groups.

Three studies [16, 21, 22] measured RT interruptions. All found fewer RT interruptions in the honey groups; the first two studies had statistically significant findings, and the third did not calculate statistical significance.

Four studies measured QoL; three studies [17, 19, 22] found higher QoL scores in the honey group but only one of those [17] calculated the statistical significance. The fourth study [20] found no statistically significant difference between the groups.

Three studies recorded AEs: one study [20] found most of the participants who dropped out of the study reported nausea, a strong taste of honey or burning in the mouth. One study [13] reported AEs but it is more likely these were related to the OM itself rather than the product. One study [18] reported that there were no AEs related to honey.

Quality of studies

Only two out of the twelve studies [18, 21] investigating the use of honey to mitigate RIOM had moderate methodological quality. The other ten had low methodological quality.

Oral glutamine

Primary outcome measure (oral mucositis measures)

Seven out of the eight studies examining OG [25,26,27,28,29,30,31] measured incidence of severe OM. All but one study [25] found statistically significantly fewer patients in the OG group had severe OM. Five studies measured onset of OM. Three of these studies [26, 29, 30] found a statistically significant delay in onset of OM in the OG group; the other two studies [27, 32] found no statistically significant difference between the groups. Three studies [25, 27, 29] measured maximum OM scores, all finding statistically significantly lower maximum OM scores in the OG groups. Three studies measured duration of OM; one study [26] found a statistically significantly shorter duration of OM in the OG group; one study [27] found no statistically significant difference, and one SR [29] reviewed two studies, one of which found a statistically significant difference and the other did not. One study [32] measured incidence of OM (grades 1 to 4) and found no statistically significant difference between the groups. One study [27] measured mean OM and found a statistically significantly lower mean OM score in the OG group during weeks 5 and 6 of RT. One study [32] measured functional OM and found no statistically significant difference between the groups.

Secondary outcome measures

Four studies measured pain; two of those studies [27, 29] found a statistically significant reduction in pain in the OG group; one study [30] found fewer participants in the OG group experienced pain although the statistical significance was not reported; one study [32] found no statistically significant difference between the groups.

Three studies measured the number of participants requiring analgesics. One of those studies [27] found no statistically significant difference; one SR [29] reviewed a study which found no difference; and one study [30] found fewer participants in the OG group required analgesics, although the statistical significance was not reported.

Three studies measured weight loss. One SR [29] reviewed two studies, one of which found statistically significantly less weight loss in the OG group and the other did not; one study [32] found no statistically significant difference between the groups; and one study [31] found statistically significantly less weight loss in the OG group.

Three studies [29,30,31] measured the number of participants requiring feeding tubes. All found fewer patients in the OG group required feeding tubes fitted, although only the first two reported that the findings were statistically significant.

Four studies recorded AEs. Three studies [27, 29, 32] reported no AEs related to OG. One study [30] reported more AEs in the control group, but it was likely these were related to OM rather than the product.

Quality of studies

Two studies [25, 28] had moderate methodological quality. The other six studies had low methodological quality.

Risk of bias across studies

The author considered that the risk of bias across the studies was high due to heterogeneity.

Recommendations for clinical practice

The GRADE Evidence to Decision framework [8] was used to assess the evidence from this SR. A summary of the judgements and conclusions for interventions to mitigate RIOM in HNCPs are outlined in Tables 5 and 6.

Table 5 Evidence to Decision framework justifications
Table 6 Evidence to Decision framework

Although the findings in the studies examining BHM were mainly positive, the author cannot recommend BHM to mitigate RIOM due to the overall low methodological quality and poor tolerance of the product.

Eleven out of the twelve studies examining honey found it to be efficacious either in reducing the incidence of severe OM or mean OM grade, or delaying onset of OM. Additionally, of those eleven studies, two were of moderate methodological quality. However, one of the studies with moderate methodological quality [21] found honey to be efficacious at delaying onset of OM (and reducing RT interruptions and weight loss) but not at reducing OM severity. Therefore, the author can only recommend honey to reduce complications of RIOM, but not to mitigate it. Three out of the six RCTs included patients having moderate doses of RT, and so the author cannot recommend this intervention for patients having higher doses of RT (at least 64 Gy). Additionally, there is a potential risk of honey consumption in diabetic patients. Finally, the author cannot recommend Manuka honey due to the poor tolerance.

Seven out of the eight studies examining the use of OG to mitigate RIOM had favourable findings. Two studies were of moderate methodological quality, and there were no adverse effects recorded. So, the author can recommend OG to mitigate RIOM.

Discussion

This systematic review examined the efficacy of low-cost interventions to mitigate RIOM. The review identified interventions that had been examined in four or more studies conducted within the last 5 years. These interventions were BHM, honey and OG. The search identified twenty-four studies. The efficacies of the interventions were examined through the assessment of OM and secondary outcome measures. The review examined the interventions’ safety through the collection of data on adverse effects encountered. Following this, the evidence was applied to the GRADE EtD frameworks to inform clinical guidelines.

Recurrent themes that emerged included small sample sizes, most RCTs being single-centre studies, lack of blinding, heterogeneity, lack of data on AEs and lack of analysis of cost-effectiveness.

Most of the RCTs were small, single-centre studies, and even the two multi-centre studies had small sample sizes. Small samples are at risk of false-negative findings, and single-centre studies limit generalisability. Few of the studies examined in this review were blinded, and those that were not are at risk of bias. Also, in the blinded studies examining honey, the distinct taste and consistency of honey possibly increased the risk of performance bias. Identification of an effective placebo is necessary for well-conducted blinding and to reduce the risk of bias.

There was significant heterogeneity in the studies, making it difficult to draw robust conclusions. Areas where heterogeneity was identified include OM assessment tools used, presentation of OM data, secondary outcome measures, doses and frequency of intake of the interventions, type of honey used, cancer treatments delivered (including patients receiving RT alone, CRT alone, or RT or CRT; type of RT machines; RT techniques, such as conventional or IMRT; and RT dose) and inclusion of certain cancer types. To reduce heterogeneity, the author recommends consensus on a methodology to be used in future studies.

Four OM assessment tools were identified. Although use of different OM assessment tools may introduce heterogeneity, one study [20] found good inter-reliability between RTOG, WHO and OMAS.

Overall, OM data was presented in twelve ways (for example as severity of OM, incidence of OM and onset of OM), and twenty-nine secondary outcomes were recorded, which introduced heterogeneity into the studies. The most common ways that OM data was presented were incidence of severe OM and onset of OM; the most common secondary outcome measures used were pain, weight loss and RT interruptions. So, the author recommends that future studies present data in these ways and use the aforementioned secondary outcome measures.

The dose and frequency of consumption of the products varied which also introduced heterogeneity. The author recommends that studies examining the optimum dose be conducted. The type of honey used in the studies introduced further heterogeneity. One study [20] used Manuka honey, another [17] used thyme honey and the others used locally sourced, or pure, honey. Pooling data from studies using different types of honey may compromise the findings since some types of honey may be more effective at mitigating RIOM than others. One MA [21] found that the type of honey did not confound the findings; another MA [18] found local and pure natural honey efficacious and Manuka honey not efficacious at mitigating RIOM. An SR [22] reviewed thirteen studies which found conventional honey to be efficacious and four studies which found Manuka honey not to be efficacious.

Three BHM studies and three honey studies included participants having moderate doses of RT (between 50 and 64 Gy). It is likely that OM is less severe in patients having lower RT doses, and there is a possibility that including patients on lower doses makes the findings more favourable. Therefore, the findings can only be cautiously applied to patients having higher doses of RT.

There was additional heterogeneity due to inclusion of patients having different types of cancer treatment: either RT alone, CRT alone, or RT or CRT. Two studies [10, 13] found that the intervention mitigated RIOM only in patients having RT alone, not in those having CRT; and two studies [18, 26] found the intervention efficacious for patients having either RT or CRT. Therefore, the author recommends future research examining the efficacy of the interventions for each cancer treatment option.

There was further heterogeneity in the types of radiotherapy delivered. Some studies included patients having treatment on cobalt machines, or conventional RT, where RIOM is likely to be greater because of the larger margins required for the treatment field. Other studies included patients having intensity-modulated RT (IMRT), which treats smaller margins, and so RIOM is likely to be less severe. Two out of the six OG RCTs [25, 31] only included patients having IMRT. However, there were favourable results in the OG RCT [30] using conventional RT and in the OG MA [28], which had moderate methodological quality and included patients having treatment using any type of RT technique. Nevertheless, more research is needed to understand whether RT techniques are confounding factors.

Although eight studies measured acute AEs [12, 13, 18, 20, 27, 29, 30, 32], none measured long-term AEs. One may assume that, due to the sugar content, prolonged consumption of honey can induce dental caries. However, a recent study [33] found that honey can prevent dental caries. The high sugar content makes honey unsuitable for long-term consumption by diabetic patients [34]. This contraindication was considered by four of the honey studies [13, 14, 16, 17] which excluded people with diabetes from participating, and another study [20] where participants were asked to monitor their blood sugar levels. However, excluding diabetic patients reduces generalisability of the findings. An RCT examining the use of parenteral alanyl-glutamine dipeptide, used as a supplement for autologous bone marrow transplant patients [35], found an increased mortality rate in the intervention group. However, a more recent SR and MA [36] reviewing glutamine supplementation for haematopoietic stem cell transplantation found no effect of either oral or IV glutamine on mortality rates. The author recommends future studies that examine the long-term AE of the interventions.

A significant limitation of the studies included in the SR was the quality of the methodologies. Only five out of the twenty-four studies identified were of moderate methodological quality; the rest were of low quality. The internal validity of low-quality studies is likely to be compromised, which makes it difficult to draw robust conclusions. It is therefore recommended that future studies improve the quality of their methodologies.

The focus of this review was to examine low-cost interventions to mitigate RIOM. Low-cost interventions were classed as those with few set-up costs. However, none of the studies examined the cost-effectiveness of the interventions, and so the author cannot make strong recommendations based on this. The author recommends future research in this area. The author acknowledges that there are financial barriers to producing high-quality research on low-cost interventions. Until more high-quality studies are available, the author recommends that clinicians consider the best available evidence-based interventions.

There were some limitations in the methodology of this SR. One limitation was that the search for studies was not comprehensive. The search only included published studies in the English language for which the full text was available. So, it is likely that selection and publication bias were present.

The author included interventions where there had been at least four studies conducted. Most studies examining RIOM have small sample sizes and are of low methodological quality and so risk false-negative findings. Including interventions with four or more studies reduces this risk, and so more robust conclusions could be drawn. However, including interventions where there had been fewer, good-quality studies may have been more appropriate. The author excluded studies conducted more than 5 years ago so that only the most up-to-date studies were included. However, selection bias could have been reduced by not limiting the search by year of publication.

Another limitation was that the SR only examined low-cost interventions. When making recommendations for clinical practice, the primary aim should be to find efficacious interventions over cost-saving ones. An alternative approach to research in this area could be to find cost-saving methods for already-established interventions for RIOM. Examples include finding lower-cost LLLT devices, finding ways to reduce training costs or having regional centres deliver LLLT (to reduce the number of devices needed and the number of people trained to deliver the treatment).

A further limitation of this SR was that the research was conducted by one person, which may introduce bias. Finally, the SR only examined the efficacy of interventions to mitigate RIOM, and so conclusions cannot be applied to other causes of OM.

Conclusion

The author cannot recommend BHM to mitigate RIOM due to the low quality of the studies and poor tolerance of the product. The author cannot recommend honey to mitigate RIOM but can recommend it to reduce complications of RIOM (for example weight loss, pain and RT interruptions) for patients on moderate doses of RT, but not for diabetic patients. The author can recommend OG to mitigate RIOM. There is a need for high-quality studies that use a consensus methodology to reduce heterogeneity and that examine the cost-effectiveness of the interventions.