Introduction

Oropharyngeal dysphagia (OPD) refers to difficulty or inability to move a bolus safely and effectively from the oral cavity to the esophagus. It can lead to clinical complications such as malnutrition and dehydration, and to severe complications such as aspiration pneumonia, suffocation, and eventually premature death. The global prevalence of OPD has been estimated at 43.8% [1], and prevalence is higher in the presence of predisposing conditions such as stroke, Parkinson’s disease, and pneumonia [1,2,3]. OPD also impairs the quality of life and psychological well-being of patients: social activities and daily routines are disrupted, resulting in isolation and social exclusion [4].

Patients with OPD represent a large population, and OPD is a daily problem for them that seriously threatens average survival time once symptoms deteriorate [5, 6]. It therefore receives considerable attention from health workers and patients’ families. Therapeutic approaches to improve the safety and efficiency of swallowing can be divided into compensatory behavioral strategies, dietary modifications, and rehabilitation exercises, most of which are direct interventions for swallowing [7]. Currently, dietary changes and the use of thickeners are the standard treatments, but they are quite expensive, may reduce patients’ quality of life, and do not promote recovery [8,9,10]. Therefore, finding a better way to improve the symptoms of patients with OPD has become an urgent global health issue.

Voice training (or vocal exercises) consists of intensive phonation exercises that act on the intrinsic and extrinsic laryngeal muscles to improve the coordination between myoelastic and aerodynamic laryngeal forces, so that each articulatory element reaches its optimal state and maintains optimal function for an extended period [11, 12]. The organs associated with swallowing and speech are structurally and neurologically linked [10]. Appropriate voice training promotes closure of the vocal cords while driving coordinated contraction of the cervical swallowing muscles, and it strengthens the pharyngeal muscles and improves their mobility [12]. Voice training can also stimulate regional networks of the cerebral cortex. These networks include the supplementary motor area and the anterior cingulate area, which appear to be associated with swallowing movement [13], so their activation may promote swallowing recovery. Appropriate voice training intervention can further help promote the self-management ability of patients [14]. In short, voice training is a convenient, inexpensive, and highly beneficial method for patients with OPD.

Overall, existing studies suggest that improving voice function has a positive impact on swallowing function. Nevertheless, the physiological mechanism by which voice training improves swallowing function in patients with OPD has not been clarified, and no review has analyzed this phenomenon in depth. Therefore, we attempted to gain more insight into the changes that occur in swallowing function after voice training. The accepted gold standard for swallowing function assessment remains videofluoroscopy. However, many other measures of swallowing function [15], such as the Functional Oral Intake Scale and electromyography, are also widely used and are reliable measurements of swallowing ability. These additional assessment types have yet to be comprehensively summarized. Thus, our research questions were:

  1. Which voice training intervention protocols have been used to target improved OPD?

  2. Which measures of swallowing physiology have been reported?

  3. What additional measures have been used to capture the impact of voice training intervention protocols on swallowing?

  4. What are the reported results of voice training intervention protocols?

Methods

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [16] was used to guide the development and methodology of the present systematic review.

Search strategy

Following the PRISMA guidelines [17], we searched PubMed, Embase, Cochrane Library, Web of Science, and CINAHL in April 2022 using the following descriptors in English: “voice training,” “voice therapy,” “vocal exercises,” “Voice Treatment,” “singing,” “Whistling,” “Rhythmic Vocalizations,” “Chant,” “dysphagia,” “deglutition,” “swallowing,” and “Oropharyngeal Dysphagia.”
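As an illustration of how such descriptors are typically combined, the sketch below assembles a single Boolean search string. The grouping (OR within the voice-related terms and within the swallowing-related terms, AND between the two groups) is an assumption made for this example; the exact string submitted to each database is not reported here, and the function names are hypothetical.

```python
# Illustrative sketch only. The descriptors come from the text; the AND/OR
# grouping and the helper names are assumptions, not the reported strategy.
VOICE_TERMS = [
    "voice training", "voice therapy", "vocal exercises", "Voice Treatment",
    "singing", "Whistling", "Rhythmic Vocalizations", "Chant",
]
SWALLOWING_TERMS = [
    "dysphagia", "deglutition", "swallowing", "Oropharyngeal Dysphagia",
]


def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(intervention_terms, condition_terms):
    """Combine the two concept groups with AND."""
    return f"{or_group(intervention_terms)} AND {or_group(condition_terms)}"


query = build_query(VOICE_TERMS, SWALLOWING_TERMS)
print(query)
```

Each database has its own field tags and syntax, so a string of this kind would normally be adapted per database rather than reused verbatim.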

Inclusion and exclusion criteria

Detailed inclusion and exclusion criteria are outlined in Table 1. We did not limit studies by date; however, we restricted our search to English-language publications. The review was restricted to randomized controlled trials (RCTs), controlled studies, case–control studies, cohort studies, and case series designs. Single case studies were excluded from this review, as we aimed to examine articles using representative samples of a reasonable size. Moreover, editorials and narratives were excluded due to their lack of prospective intervention design.

Table 1 Inclusion and exclusion criteria

Study selection

As shown in Fig. 1, the original search yielded a total of 1634 records, of which 233 were duplicates. After the removal of these duplicates, two reviewers independently assessed the titles and abstracts of all retrieved records and determined their eligibility for inclusion. Disagreements were settled by consensus and, if needed, by a third reviewer. All reviewers had received systematic training and had extensive background knowledge relevant to the research.

Fig. 1 PRISMA flow diagram depicting the different phases of the systematic review, mapping out the number of records identified, included and excluded

Risk of bias assessment

A methodological quality assessment of each individual study was completed independently by each reviewer to evaluate the validity of the study design and reporting methods. Risk of bias evaluation was completed using the Cochrane Collaboration’s Tool for Assessing Risk of Bias [18]. The criteria assessed were selection bias (random sequence generation, allocation concealment), performance bias (blinding of participants and personnel), detection bias (blinding of outcome assessment), attrition bias (incomplete outcome data), and reporting bias (selective reporting).

Each item on the Cochrane Collaboration’s Tool for Assessing Risk of Bias was scored with a “Y” for yes if susceptible to bias in that category, an “N” for no if not susceptible to bias in that category, or a “U” for unsure/other if raters could not determine appropriate scores, if the criteria were not applicable, or if this was not reported for that particular category.
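The Y/N/U scoring scheme described above can be summarized per domain with a simple tally. The sketch below is a minimal illustration: the study labels and ratings are hypothetical placeholders rather than the ratings of the included studies, and the `tally` helper is not part of the Cochrane tool itself.

```python
from collections import Counter

# Domains as listed in the text for the Cochrane risk-of-bias tool.
DOMAINS = [
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective reporting",
]
VALID_SCORES = {"Y", "N", "U"}  # Y = susceptible, N = not susceptible, U = unsure/other


def tally(ratings):
    """Count Y/N/U scores per domain. ratings: {study: {domain: score}}."""
    counts = {domain: Counter() for domain in DOMAINS}
    for study, by_domain in ratings.items():
        for domain in DOMAINS:
            # An unreported item is scored "U", as in the scheme above.
            score = by_domain.get(domain, "U")
            if score not in VALID_SCORES:
                raise ValueError(f"{study}: invalid score {score!r} for {domain}")
            counts[domain][score] += 1
    return counts


# Hypothetical ratings, for illustration only.
example_ratings = {
    "Study A (hypothetical)": {domain: "N" for domain in DOMAINS},
    "Study B (hypothetical)": {
        "random sequence generation": "Y",
        "blinding of outcome assessment": "U",
    },
}
summary = tally(example_ratings)
for domain in DOMAINS:
    print(domain, dict(summary[domain]))
```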

Data extraction process

Data extraction was completed independently by a single rater for full articles that met all inclusion criteria outlined above. Data extraction included the following: (1) study design; (2) patient population descriptions (age, sex, etiology); (3) sample size; (4) proportion of male and female participants; (5) intervention details; and (6) outcome measures. Results from each study were extracted and always included statistical analyses of changes in swallowing function after the intervention.

Results

Literature retrieval

Figure 1 provides an overview of the selection process for included studies. Of the 1401 studies identified for preliminary screening of titles and abstracts, 1358 were rejected after failing to meet inclusion criteria. The 43 articles considered potentially relevant after this initial screening were then read in full, with special attention paid to the study design, the intervention (treatment type and outcome evaluation indicators), and other factors. Twelve full-text articles were excluded because they did not mention voice training interventions for OPD; 11 articles were non-experimental studies; 2 articles were not available; 2 articles were not published in English; data could not be extracted from 3 articles; and 5 studies lacked available data. Overall, eight articles met this review’s objective and inclusion criteria [6, 19,20,21,22,23,24,25].
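The record counts reported for the selection process can be checked with simple arithmetic. The following sketch uses only numbers stated in the text; the variable names and dictionary labels are illustrative.

```python
# Counts taken from the text; labels are illustrative.
records_identified = 1634
duplicates_removed = 233
excluded_at_title_abstract = 1358

full_text_exclusions = {
    "no voice training intervention for OPD": 12,
    "non-experimental study": 11,
    "full text not available": 2,
    "excluded on language grounds": 2,
    "data could not be extracted": 3,
    "lack of available data": 5,
}

# Records screened by title and abstract after duplicate removal.
screened = records_identified - duplicates_removed
# Articles read in full text.
full_text_assessed = screened - excluded_at_title_abstract
# Articles retained for the review.
included = full_text_assessed - sum(full_text_exclusions.values())

print(screened, full_text_assessed, included)  # 1401 43 8
```

The three values agree with the 1401 records screened, the 43 full-text articles assessed, and the eight studies included.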

Quality assessment

Figure 2 summarizes the quality assessment of all included studies using the Cochrane Risk of Bias Tool. Selection bias was identified in one study [6], where participants were enrolled either via convenience sampling or by consecutive recruitment without randomization. Performance bias was identified in two studies. One study [19] was deemed to have a high risk of performance bias because blinding of participants was not strictly implemented and grouping information was disclosed. One study did not report whether participants and staff were blinded, so it is not certain whether outcomes would have been affected [6]. For two studies, we could not be certain of detection bias, because neither mentioned blinding of the raters of any outcome measure [6, 20]. Two studies were deemed to have a high risk of attrition bias: participants did not complete the full intervention in one study [21], and another study did not explain how data from interrupted participation were managed [22]. Reporting bias was deemed to be highly likely in one study [21], which did not report the outcome of a study indicator. Finally, additional biases were identified in one study [22], wherein baseline data varied significantly.

Fig. 2 Cochrane tool for risk of bias

Patient characteristics

Patient characteristics are reported in Table 2. The eight studies covered six patient populations: Parkinson’s disease (idiopathic) [6, 23, 24], progressive supranuclear palsy [25], head and neck cancer [19], post-orotracheal intubation [22], multiple system atrophy [20], and stroke [21]. Sample sizes varied widely across studies, ranging from 7 participants [25] to 32 participants [22]. Studies included both male and female participants; however, most studies included a larger proportion of males than females. The participants were generally older adults.

Table 2 Patient characteristics

Question 1: training protocols

Voice training interventions included Lee Silverman voice treatment (LSVT), therapeutic singing, and vocal exercises. At present, voice training interventions for patients with OPD focus on LSVT, which was used in 50% of the studies [20, 23,24,25]. LSVT practices include maximum duration of sustained vowel phonation, maximum fundamental frequency range, and maximum functional speech loudness drills [24]. Therapeutic singing consisted of physical preparation, vocalization for warm-up, singing exercises for laryngeal elevation, and modified singing of approximately 20 min in duration [19]. Vocal exercises included intensive phonation exercises and laryngeal raising and lowering exercises [21]. The most consistent duration for LSVT treatment was 4 weeks. The shortest duration of voice training was no more than 10 days, and therapeutic singing practice had a maximum duration of 8 weeks. Session length fell into two categories, 20–30 min per session and 50–60 min per session, while the weekly frequency of voice therapy spanned a wide range across studies. Exercises were completed with direct clinical guidance from speech and language professionals or therapists (see Table 3).

Table 3 Training protocols

Question 2: physiological measurements of swallowing

Each study, along with its inclusion criteria, study design, and a list of the outcome measures collected, is presented in Table 4. Of these studies, three used the videofluoroscopic swallowing study (VFSS) as the measurement tool. In the literature, fiber-optic endoscopy and/or videofluoroscopy of swallowing are regarded as the gold standard [26, 27]. All studies used temporal measurements to reflect changes in swallowing.

Table 4 Swallowing measures

Two studies used the Videofluoroscopic Dysphagia Scale (VDS) as the tool of measurement [19, 20]. VDS has 14 items with a total score of 100, which are divided into the oral phase (7 items, 40 points) and the pharyngeal phase (7 items, 60 points). The higher the score, the more severe the swallowing difficulty [20]. One of the studies did not report scores for each parameter but instead reported total scores for all parameters [20].
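The structure of the VDS described above (two phases, 14 items, 100 points in total) can be expressed as a small consistency check. This is a minimal sketch: only the phase-level item counts and maximum points come from the text, while the `vds_total` helper and the example scores are hypothetical, and item-level weights are deliberately not modeled.

```python
# Phase-level structure of the VDS as stated in the text.
VDS_PHASES = {
    "oral": {"items": 7, "max_points": 40},
    "pharyngeal": {"items": 7, "max_points": 60},
}

# The two phases should account for all 14 items and all 100 points.
assert sum(phase["items"] for phase in VDS_PHASES.values()) == 14
assert sum(phase["max_points"] for phase in VDS_PHASES.values()) == 100


def vds_total(oral_score, pharyngeal_score):
    """Return the total VDS score after checking each phase score is in range."""
    for name, score in (("oral", oral_score), ("pharyngeal", pharyngeal_score)):
        max_points = VDS_PHASES[name]["max_points"]
        if not 0 <= score <= max_points:
            raise ValueError(f"{name} score {score} outside 0-{max_points}")
    return oral_score + pharyngeal_score


# Hypothetical phase scores; a higher total indicates more severe dysphagia.
print(vds_total(10, 25))  # 35
```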

One study [6] used electromyography (EMG) to measure swallowing in patients. In previous studies, EMG has been used to describe the swallowing function of patients and is reported to be effective in identifying differences between patients with swallowing disorders and healthy individuals [28]. The EMG indicators of swallowing are peak amplitude, area under the curve (AUC), time to peak amplitude, rise time, fall time, and duration [28].

Question 3: additional measures

Additional measures used to determine the effects of voice training included the Speech Handicap Index-15 (SHI-15), which assesses the effects of patient speech on daily life [29], and the Eating Assessment Tool-10 (EAT-10), which documents the initial dysphagia severity and monitors the treatment response in persons with a wide array of swallowing disorders [30]. Moreover, the Quality of Life in Swallowing Disorders Questionnaire (SWAL-QOL) is used to quantify changes in swallowing-related quality of life [31], and the Unified Parkinson’s Disease Rating Scale (UPDRS) is employed to measure the patient’s neck stiffness and the effect of swallowing therapy (neck stiffness may affect swallowing and electromyography measurements) [24].

In addition, two studies [21, 22] did not use fiber-optic endoscopy or videofluoroscopy to evaluate changes in swallowing before and after treatment, instead opting to use only qualitative scales. Both studies used the Functional Oral Intake Scale (FOIS), which has been described as an important and reliable tool for assessing oral intake progression [32], to assess changes in functional oral intake [33]. One study used the Protocolo de Avaliação do Risco para Disfagia (PARD) method to characterize clinical signs that are suggestive of laryngeal penetration or aspiration and the severity of dysphagia [34].

Question 4: voice training intervention outcomes

VFSS measurements: The most commonly collected outcome measures of swallowing physiology to determine changes pre- and post-treatment were temporal measurements [35]. Statistically significant changes in swallowing timing measurements were identified in two studies [23, 25]. Both studies revealed an increase in esophageal sphincter opening time in patients after LSVT. Significant changes in oral transit time (OTT) and oropharyngeal swallow efficiency (OPSE) were identified in another study [24].

VDS measurements: Statistically significant changes in pharyngeal phase scores were identified in two studies [19, 20]. In one of the articles [20], an improvement in total VDS score was also statistically significant; more notably, changes in the pharyngeal phase of the VDS persisted into the follow-up period in this study.

EMG measurements: The results of one preliminary study [6] revealed that EMG time measurements of the laryngeal and submental muscle groups during swallowing were significantly increased after therapeutic singing in patients with Parkinson’s disease. Moreover, analysis of EMG results revealed no relationship between the swallowing ability of patients and the rate of weekly treatment reviews.

FOIS measurements: Both studies [21, 22] using FOIS demonstrated statistically significant improvements in swallowing from before to after treatment.

Other measurements: The results of these measures all revealed improvements in swallowing function in different ways. Detailed information regarding all reported outcomes is provided in Table 5.

Table 5 Summary of outcome measures and results

Discussion

Methodological comments

The review presented herein identified mixed evidence on whether voice training intervention specifically impacts swallowing function. Overall, voice training can improve swallowing in patients with neurological dysphagia, such as stroke, and in patients with non-neurological dysphagia, such as head and neck cancer. For patients with swallowing disorders who face expensive treatments, complex therapies and additional treatment time [36], voice training to improve swallowing function may help alleviate these burdens.

Nevertheless, all studies provided information regarding the short-term effects of the treatment, while minimal data were reported regarding long-term effects. Only two studies conducted follow-ups; however, the longest follow-up period did not exceed 6 months. Therefore, it was difficult to judge the long-term effects of voice training intervention on swallowing function. In addition, 50% of reviewed study protocols did not include a control group, which may result in the misinterpretation of data confounded by the natural recovery over time of oropharyngeal swallowing function in dysphagia patients.

In the field of voice therapy, quality-of-life assessment is already established as an important evaluation technique [37, 38]. However, in this systematic review, this issue was identified as being regularly ignored when describing therapy outcomes. Only one of the final eight studies highlighted quality-of-life issues related to OPD. It is hoped that future studies will value QOL assessments.

Moreover, the evaluation of treatment outcomes in the reviewed studies was broadly limited to small samples measured by a small number of speech therapists. In addition, there was large heterogeneity in the patient populations selected in the included articles. Two studies [21, 22] were limited to the use of self-assessment tools only and did not use instrumental evaluations of swallowing to determine the impact of voice interventions on dysphagia. These limitations significantly restrict both our understanding in this area and the interpretation of the evidence currently available. A wide variety of heterogeneous methods are available for the measurement of swallowing function. It is, however, recommended that objective swallowing measurement tools such as VFSS and EMG be used, as these methods should enable proper assessment of a patient’s swallowing function.

Finally, the current voice intervention therapies are relatively simple and generally applied; more individualized therapies have not yet been developed for the distinct stages of dysphagia or associated muscle groups. However, surface EMG can be used to quantify the muscle strength associated with swallowing and also to monitor the status of different muscle groups during pronunciation [15]. A more precise and stratified intervention plan could be constructed by assessing the electromyographic activity of the superficial swallowing muscles with surface EMG, which would be beneficial for the rehabilitation of swallowing function in patients.

Therapy effect comments

In the included articles, patients exhibited improvement in both the oral and pharyngeal phases of swallowing. During the oral phase, voice training intervention was effective in improving tongue strength, enabling the tongue to better control the bolus and ameliorating premature bolus loss [20, 24]. Improvement in oral transit time (OTT) and premature bolus loss in patients could indicate improved tongue motor function during the oral phase (Table 5). During the pharyngeal phase, voice training intervention was effective in modifying glottal closure and laryngeal elevation, and these mechanisms were observed to be associated with airway protection. The improvement of maximal opening of the pharyngoesophageal sphincter (PESmax) on VFSS, the improvement of the rise and fall times of laryngeal muscle groups on EMG, and the improvement of VDS pharyngeal phase score, FOIS score, and DIGEST score [6, 19, 21, 22, 23] all indicate improvements in laryngeal muscle swallowing function in patients (Table 5).

Voice training intervention can improve maximum phonation time (MPT) in dysphagia patients [20]. MPT is correlated with oropharyngeal motor functions, such as tongue movement (bolus formation, oral transit time), laryngeal elevation, and pharyngeal swallow triggering [39]. Thus, MPT status may contribute to the mechanisms by which voluntary swallowing is improved.

Because the voice training protocol designed to enhance swallowing required patients to sing at different pitches, this particular therapy facilitated an increase in the width of the upper esophageal sphincter and increased the extent and duration of laryngeal elevation [19]. Maintaining elevation of the laryngeal complex and thus the hyoid bone allows the esophageal port to stay open longer, giving the bolus more time to clear and reducing the chance of aspiration [40]. Voice intervention training is capable of altering the neurophysiologic mechanisms responsible for the upper digestive system, with orofacial myofunctional adjustment aiding the elimination of laryngotracheal aspiration risks [22]. In addition, vocal exercises enable a significant increase in the vibration and movement of the laryngeal and pharyngeal structures, primarily focused at the vocal folds, which leads to an increase in the amplitude of vibrations at the mucous membrane. This increase in vibration facilitates the activity of the inward process, which subsequently improves glottal closure function [21]. In conclusion, the promotion of both laryngeal elevation and vocal fold closure suggests that voice intervention training can improve swallowing function in OPD patients.

Furthermore, some studies identified that in addition to the primary sensorimotor cortex (pharynx–larynx representation) and brainstem, the additional region most strongly activated during voluntary swallowing was the right anterior insular cortex [41, 42]; this region is one of the sites associated with significant change after voice training intervention [43]. Therefore, the right anterior insular cortex may further contribute to the mechanism of improved voluntary swallowing in dysphagia patients [24].

Limitations

Limited study designs, poor methodological quality, and the small sample sizes of included studies may limit the conclusions drawn from our analyses. Eligible studies indexed in databases that we did not search may not have been identified. Finally, as limited translational resources were available, only English studies were included in this review. However, despite this limitation, only two non-English studies were excluded at the full-text screening stage.

Conclusions

Overall, this systematic review described the effects of voice training interventions on swallowing function. Voice training improves the oral and pharyngeal stages of swallowing in patients with neurological causes of dysphagia, such as stroke, and in patients with non-neurological causes of dysphagia, such as head and neck cancer. However, the number of included studies was small, and the studies used such diverse assessment tools that they could not be quantitatively analyzed. Therefore, further studies are needed to provide more evidence to support voice training intervention in dysphagia.

Currently, voice interventions are available for people with a variety of underlying conditions that cause dysphagia. Future studies should endeavor to include more patients, expand the coverage of the treatment population, assess the long-term effects of voice training interventions, give greater weight to evaluating improvements in patients’ quality of life after treatment, and provide stronger evidence to clarify the effect of voice training on swallowing function. Future research should further attempt to refine and stratify the content of voice training with the support of clinical practice to facilitate more rapid improvement of swallowing function in patients.