Oropharyngeal dysphagia (OD) occurs in 2.3–16% of people, depending on the population studied [1,2,3,4], and its prevalence increases with age [1]. OD is present in 33–40% of people over 60 years of age and in 50% of those over 75; among elderly patients with community-acquired pneumonia, the percentage is even higher (91%) [5, 6]. Swallowing disorders are therefore not a marginal problem. We live in an aging society in which people live longer and are therefore more likely to experience swallowing disorders. The most common causes of swallowing disorders after the age of 60 are neurological diseases (e.g., 8.1–80% of stroke patients, 11–60% of patients with Parkinson’s disease) and head and neck cancers (e.g., 52% of patients after radiotherapy, 69% after chemoradiotherapy, 10–72% of patients after total laryngectomy, 51% of patients with head and neck cancer) [1, 3, 6,7,8]. The effects of OD range from malnutrition and dehydration to aspiration pneumonia and asphyxia. These complications can have a significant impact not only on the health and lives of patients, but also on their quality of life.

Recently, quality of life (QoL) has come to play a key role in treating patients with dysphagia. WHO defines quality of life as “individual perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards, and concerns” [9]. Self-assessment through patient-centered measures has also gained importance, since it may not correlate with clinician-driven tools [10]. Many questionnaires assessing the QoL of patients with dysphagia have been developed. One of these is the M.D. Anderson Dysphagia Inventory, a questionnaire for head and neck cancer (HNC) patients [11]. In contrast, the Dysphagia Goal Handicap (DGH) was designed for patients with esophageal dysphagia [12]. One of the more general questionnaires is the Swallowing Quality of Life Questionnaire (SWAL-QOL), designed for adult OD patients with heterogeneous etiologies [13]. Finally, the Eating Assessment Tool-10 (EAT-10) aims to assess OD-related functional health status [14]. However, for the time being, none of them has been validated for the Polish language [15]. The Dysphagia Handicap Index (DHI) and the SWAL-QOL are the questionnaires for oropharyngeal dysphagia with the strongest psychometric ratings. The SWAL-QOL is considered the gold standard and exhibits good internal consistency reliability and short-term reproducibility, but it is longer than the DHI and could be difficult for some groups of patients to fill out [13, 16, 17]. Therefore, the authors chose to translate and validate the DHI questionnaire, because the form is more concise and user-friendly than the SWAL-QOL [18].

The DHI is a self-assessment questionnaire on quality of life as it pertains to and results from the ability to swallow. It consists of 25 statements, each of which the patient answers on a three-point Likert scale, where 0 means "never," 2 means "sometimes," and 4 means "always". The statements examine three aspects of swallowing disability: functional (9 statements, numbers: 6, 7, 9, 10, 14, 15, 16, 22, 23), physical (9 statements, numbers: 1, 2, 3, 4, 5, 11, 20, 24, 25), and emotional (7 statements, numbers: 8, 12, 13, 17, 18, 19, 21). The maximum score is 100 points. A score of 0 means the patient is completely satisfied with their ability to swallow. The higher the DHI value, the greater the patient's dissatisfaction with their swallowing quality and ability. Sobol et al. suggest a normative DHI score of 4 for healthy participants, with a score above 4 indicating self-perceived dysphagia symptoms [19]. A visual analogue scale (VAS) is added to the DHI questionnaire. This is a simple scale, commonly used in medicine, to assess the severity of disorders experienced by patients. In the study, the VAS was used to help determine the severity of the swallowing problems experienced by the patient. It takes values from 1 to 7. The numbers are graphically represented on a ruler, where 1 is normal swallowing, 4 is a moderate swallowing problem, and 7 is a severe swallowing problem.
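The scoring rule described above can be illustrated with a short sketch. It is not part of the DHI itself: the function name `score_dhi` and the input layout (a mapping from item number to response value) are assumptions made for illustration, while the item-to-subscale assignment and the response values 0, 2, and 4 are taken from the description above.

```python
from typing import Dict

# Item numbers per subscale, as listed in the text above.
SUBSCALES = {
    "functional": (6, 7, 9, 10, 14, 15, 16, 22, 23),
    "physical": (1, 2, 3, 4, 5, 11, 20, 24, 25),
    "emotional": (8, 12, 13, 17, 18, 19, 21),
}
ALLOWED_VALUES = {0, 2, 4}  # never, sometimes, always


def score_dhi(responses: Dict[int, int]) -> Dict[str, int]:
    """Return subscale and total scores for one completed questionnaire.

    `responses` maps the item number (1-25) to the response value.
    The input layout is a hypothetical choice for this sketch.
    """
    if set(responses) != set(range(1, 26)):
        raise ValueError("expected answers to items 1-25, with none missing")
    if not set(responses.values()) <= ALLOWED_VALUES:
        raise ValueError("each response must be 0, 2, or 4")
    scores = {
        name: sum(responses[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    # The total is the sum of the three subscale scores (maximum 100).
    scores["total"] = sum(scores.values())
    return scores
```

Because the three subscales together cover all 25 items exactly once, a respondent who answers "always" to every statement reaches the maximum of 100 points (36 functional, 36 physical, 28 emotional).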

The DHI is becoming more widely used, and it has been translated into Hebrew [18], Persian [20], Arabic [21], Japanese [22], and Kannada [23]. It is highly problematic that no questionnaire for examining swallowing-related quality of life has been adapted and validated in the Polish language. Since there is no Polish version of the DHI questionnaire (PL-DHI), this study undertakes to translate and adapt it for the Polish-speaking population and culture.


Approval was obtained from the Polish Local Ethics Committee of the Medical University of Warsaw.


The original English version of the DHI was translated in accordance with the principles of good practice for the translation and cultural adaptation of patient-reported outcome measures, as defined by the International Society for Pharmacoeconomics and Outcomes Research [24]. The original DHI questionnaire was translated by a speech-language pathologist and Polish philologist who is fluent in English. The translation was discussed with an experienced phoniatrician who is likewise fluent in English and then with a native speaker of English who is a qualified professional translator fluent in Polish. The items of the questionnaire were then back-translated into English and compared against the original DHI. The back-translation was reviewed by the authors. The reconciled translation, which will ultimately be used with Polish patients, was then reviewed and pilot-tested on twenty subjects with oropharyngeal dysphagia of different etiologies from the Speech-Language Pathology and Phoniatrics Clinic at the Medical University of Warsaw. The next step was to assess internal consistency using Cronbach's alpha coefficient (Cronbach's α). Test–retest agreement was assessed using the Spearman rank correlation coefficient. No major corrections were made, and the result was the PL-DHI (Fig. 1).

Fig. 1 The Polish version of the Dysphagia Handicap Index


Inclusion criteria for the study group were (1) presence of OD of any etiology, (2) age over 18, and (3) ability to independently read a written text. Dysphagia was defined as a disturbance of the intake or transport of food from the mouth to the stomach. Oropharyngeal dysphagia was defined as a disturbance during the oral preparatory, oral transport, or pharyngeal phase of swallowing. Moreover, subjects with residue (including residue in the oral cavity) that increased the risk of aspiration were also classified as OD patients [25].

Exclusion criteria were (1) inability to understand written Polish, (2) cognitive dysfunction (we excluded subjects with poor logical and verbal contact or any with cognitive dysfunction mentioned in neurological medical documentation), or (3) evidence of purely esophageal dysphagia.

After a short oral instruction (given each time by the same SLP), participants completed the questionnaire independently using paper and pencil, on the same day and without later help. The oral instruction included a request to complete the questionnaire by choosing one of three responses to each statement (never, sometimes, or always), which carry values of 0, 2, and 4, respectively. Moreover, each participant was asked to self-rate the severity of their dysphagia on a 7-point equal-appearing interval scale (visual analogue scale, VAS) anchored by the number 1 and the word “normal” at one end, the number 7 and the words “severe problem” at the other end, and the number 4 in the middle indicating a moderate swallowing problem. In the study group, patients completed the DHI questionnaire after the fiberoptic endoscopic evaluation of swallowing (FEES). All participants included in the study were able to read and complete the DHI questionnaire independently.

Patients were enrolled in the study after a clinical swallowing examination (carried out by an SLP) and a FEES examination (carried out by a phoniatrician and an SLP, trained during FEES accreditation courses) [26]. Both examinations were performed within the same week. FEES was performed using a XION flexible endoscope with a chip-on-tip camera and a 4 mm diameter. Swallowing was evaluated directly with nine bolus challenges, five of each consistency (liquid, puree, and solid) of approximately 2 cc volume, followed by 5 cc volume and a series of three times 5 cc volume each. The consistencies were presented as follows: five boluses of puree consistency (blue-dyed apple puree), followed by a solid consistency challenge of whole wheat bread (five pieces), and concluded with five thin liquid boluses (blue-dyed water). Patients were encouraged to feed themselves, with assistance as needed, i.e., liquid with a straw or cup and puree with a spoon. If any consistency or volume was considered unsafe to administer, or if severe impairment of swallowing efficacy was observed, the FEES protocol was not completed with the unsafe volumes or consistencies. For safety reasons, the FEES protocol was interrupted if at least one of the following conditions occurred: (1) severe impairment of the oral control of the bolus with pureed food, which led to choking, or (2) severe impairment of the oral preparatory stage of swallowing with solids, which prevented the processing of the solid into a bolus.

The presence and degree of airway invasion were measured using the penetration–aspiration scale (PAS) [27, 28]. Penetration was scored as present with a PAS score of 3 to 5 [29] and aspiration with a PAS score of 6 or higher [29]; all these patients were included in the study group. The worst bolus (i.e., the bolus with the highest PAS score) for each consistency tested and the worst PAS score among all consistencies were considered for the analysis.
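The selection of the worst bolus and the PAS cut-offs given above can be sketched as follows. The function names and the input layout are hypothetical and chosen only for illustration. The label "neither" for scores below 3 is also an assumption of this sketch, since the text above defines only the penetration and aspiration ranges.

```python
from typing import Dict, List


def worst_pas(scores_by_consistency: Dict[str, List[int]]) -> Dict[str, int]:
    """Return the highest (worst) PAS score per consistency and overall.

    Consistencies with no administered bolus (empty list) are skipped,
    mirroring a protocol that was not completed for unsafe consistencies.
    """
    worst: Dict[str, int] = {}
    for consistency, scores in scores_by_consistency.items():
        if not scores:
            continue
        if any(not 1 <= s <= 8 for s in scores):
            raise ValueError("PAS scores range from 1 to 8")
        worst[consistency] = max(scores)
    if not worst:
        raise ValueError("no PAS scores were provided")
    worst["overall"] = max(worst.values())
    return worst


def classify_pas(score: int) -> str:
    """Classify a PAS score using the cut-offs stated in the text."""
    if not 1 <= score <= 8:
        raise ValueError("PAS scores range from 1 to 8")
    if score >= 6:
        return "aspiration"
    if score >= 3:
        return "penetration"
    return "neither"  # assumed label for PAS 1-2
```

For example, a patient with scores of 1, 2, and 6 on liquids, 1 and 1 on puree, and 3 on solids would have an overall worst score of 6 and be classified as aspirating.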

Statistical Analysis

The statistical analysis was performed using Statistica 13. For the quantitative variables, the scores were summarized using descriptive statistics (mean, standard deviation, median, and range). The reliability of the DHI was determined by examining the internal consistency and test–retest reliability. The internal consistency of the total DHI and of the physical, functional, and emotional subscales was evaluated using the Cronbach alpha coefficient. The distribution of each quantitative variable was checked against the normal distribution (Shapiro–Wilk test). The number of samples available for the test–retest reproducibility analysis determined the application of the r-Spearman coefficient. The nonparametric Kruskal–Wallis test was used to compare the DHI scores in the four subgroups with swallowing disorders and the control group (CG). To assess the differences between them, the nonparametric Mann–Whitney test was performed. Because of multiple comparisons, the Bonferroni correction was applied. The results were considered statistically significant if the p value was less than 0.05 (p < 0.05), or less than 0.017 (p < 0.017) in the case of multiple comparisons.



The study group was recruited from the Department of Otolaryngology at the Medical University of Warsaw from March 2016 to June 2018. Initially, 191 patients were considered for inclusion in the study group. Thirteen patients were excluded from the study: OD was excluded in 6 patients, 3 patients with cerebrovascular bleeding presented with poor logical and verbal contact, 2 patients with neurofibromatosis type 2 were unable to complete the questionnaire because of amblyopia and hearing loss, and 2 patients refused to participate in the study due to malaise after neurosurgical intervention. Finally, 178 subjects with oropharyngeal dysphagia were included in the study. The patients presented with a wide range of diagnoses and different etiologies of swallowing disorders: neurological disorders (study group I: stroke, Alzheimer’s disease, myasthenia gravis, mitochondrial myopathy), head and neck cancer (study group II: paragangliomas, free flap reconstruction, strumectomy, vocal fold paresis, partial laryngectomy), neurosurgical operations (study group III: cerebellopontine angle tumor, brain tumors), and other disorders (study group IV: laryngopharyngeal reflux disorder (LPR), gastrointestinal tract disorder, chronic cough, Zenker’s diverticulum, and post-cardiothoracic surgeries) (Table 1).

Table 1 Demographic characteristics

The control group consisted of 35 asymptomatic adults who had no history of any swallowing disorders or LPR, no history of head and neck surgeries, no other risk factors for oropharyngeal dysphagia, and no other chronic diseases.

The control group was recruited from individuals accompanying patients, as well as hospital staff and their family members. The PL-DHI was administered to every member of the control group twice, 13 to 14 days apart. During this period, the 35 people comprising the control group did not undergo any swallowing intervention or any medical or surgical intervention. All participants were Caucasian. Table 1 shows the demographic characteristics of the study group (SG) and the CG.

In the analysis, we considered only patients without missing data. Table 2 shows the mean value, standard deviation (SD), median, and range of the PL-DHI scores of the SG, its subgroups, and the CG. Table 3 presents the distribution of the DHI subscale scores and DHI total scores for the SG. It is important to note that one participant in the control group scored 88 on the DHI. In this participant, a videofluoroscopic swallowing study (VFSS) was performed to objectively rule out a swallowing disorder.

Table 2 Values of PL-DHI for patients and control group
Table 3 Distribution of the PL-DHI subscales and PL-DHI total scores for SG

In the study group, swallowing safety was scored as a PAS of 3 (range 1–8). In SG I, patients scored a PAS of 2.5 (range 1–8), in SG II—3 (range 1–7), in SG III—4 (range 1–8), and in SG IV—1 (range 1–7).

Internal Consistency

The internal consistency of the DHI was determined using the Cronbach alpha coefficient, with values between 0.7 and 0.8 considered acceptable, values between 0.8 and 0.9 considered good, and values higher than 0.9 considered excellent (strong). The internal consistency for the total DHI score was 0.962. A more detailed analysis of each question indicates that all of them have a similar influence on the reliability of the overall scale. The alpha coefficient for items 1–25 ranged from 0.959 (for item no. 14) to 0.963 (for item no. 3). The Cronbach’s alpha coefficients were also high for the physical, functional, and emotional DHI subscales, at 0.878, 0.896, and 0.898, respectively. A strong Cronbach alpha coefficient, as in this study, indicates that the items are measuring the same construct (Table 4).
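For readers who wish to reproduce this type of analysis, the following is a minimal sketch of the standard Cronbach alpha formula, α = k/(k − 1) · (1 − Σ item variances / variance of the total score), together with the alpha obtained when each item is removed in turn. It is not the routine used in the study, which relied on a statistical package; the function names and the row-per-respondent input layout are assumptions for illustration, and sample variances are used throughout.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row of item scores per respondent."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores are constant; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows: Sequence[Sequence[float]]) -> List[float]:
    """Alpha recomputed with each item left out in turn (needs three or more items)."""
    n_items = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != dropped] for row in rows])
        for dropped in range(n_items)
    ]
```

An item whose removal raises alpha noticeably above the full-scale value would be a candidate for revision; a narrow spread of per-item values suggests that all items contribute similarly.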

Table 4 Internal consistency—Cronbach's alpha results reported for each item of the DHI

Test–Retest Reliability

In order to assess DHI test–retest reliability, a randomly selected subsample of 24 subjects completed the DHI a second time, 13–14 days after the initial assessment (Table 1 presents the demographics of the study group subsample selected for the test–retest reliability analysis).

Test–retest reliability was satisfactory for the total score and all PL-DHI subscales, as measured by the intraclass correlation coefficient (ICC). ICC values less than 0.5 indicate poor reliability, values from 0.5 to 0.75 moderate reliability, values from 0.75 to 0.9 good reliability, and values greater than 0.9 excellent reliability. For the control group, the ICC (agreement) values were 0.974 (CI 0.948–0.987) for the physical subscale, 0.992 (CI 0.983–0.996) for the functional subscale, 0.988 (CI 0.976–0.994) for the emotional subscale, and 0.993 (CI 0.986–0.996) for the total score. For the 24 subjects of the study group, the ICC (agreement) values were 0.955 (CI 0.897–0.981) for the physical subscale, 0.857 (CI 0.670–0.938) for the functional subscale, 0.944 (CI 0.871–0.976) for the emotional subscale, and 0.968 (CI 0.927–0.986) for the total score. This indicates excellent reliability for all DHI subscales in both groups, except for the functional subscale in the study group, for which reliability was good. The total score showed excellent reliability in both groups.
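The interpretation bands above can be applied mechanically to the reported coefficients, as in the sketch below. The function name is hypothetical, and the handling of values that fall exactly on a boundary (for example, exactly 0.9 counted as good, because excellence is defined as greater than 0.9) is an assumption of this sketch. The coefficients are copied from the text.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC value to the reliability bands quoted in the text."""
    if icc > 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.5:
        return "moderate"
    return "poor"


# ICC values reported in the text for each group.
REPORTED_ICC = {
    "control": {"physical": 0.974, "functional": 0.992, "emotional": 0.988, "total": 0.993},
    "study": {"physical": 0.955, "functional": 0.857, "emotional": 0.944, "total": 0.968},
}

# Reliability label for every reported coefficient.
LABELS = {
    group: {scale: interpret_icc(value) for scale, value in scales.items()}
    for group, scales in REPORTED_ICC.items()
}
```

Applied to the reported values, only the functional subscale in the study group (0.857) falls in the good band; all other coefficients fall in the excellent band.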

To measure test–retest reliability in the control group, the DHI was completed and sent or brought back by the 35 participants twice within a two-week period. The median and the range of the total DHI score were 4 (0–96) and 4 (0–98) for the first and second assessments, respectively. To measure the test–retest reproducibility, the Spearman rank test was used. The r-Spearman correlation coefficient for the control group was r = 0.97 for the total DHI score. For the DHI physical, functional, and emotional subscales, the r-Spearman coefficients were 0.91, 0.86, and 0.83, respectively. This indicates a very good level of reproducibility. For the study group, r = 0.97 for the total DHI score. For the DHI physical, functional, and emotional subscales, the r-Spearman coefficients were 0.90, 0.77, and 0.88, respectively. Figure 2 presents the agreement between the first and second assessments in the control and study groups.

Fig. 2 Bland–Altman plot for the control and study groups

DHI Subscale Correlation

The DHI subscale scores were not normally distributed (Shapiro–Wilk test), so the correlations between the subscales were assessed using the r-Spearman correlation test (very strong correlation 0.9–1, strong 0.7–0.89, moderate 0.4–0.69, weak 0.1–0.39, and no correlation 0–0.09). The correlation was highest between the emotional and functional subscales (r-Spearman coefficient 0.835) and lowest between the physical and emotional subscales (r-Spearman coefficient 0.792). The correlation coefficient between the functional and physical was 0.834.
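As an illustration of the method named above, the following sketch computes the Spearman rank coefficient as the Pearson correlation of average ranks (so that tied scores, which are common with 0/2/4 responses, share the mean of their ranks) and labels its strength with the bands quoted above. The function names are hypothetical, and treating the lower bound of each band as the cut-off (for example, 0.7 and above as strong) is an assumption, since the quoted bands leave small gaps between them.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def spearman_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def correlation_strength(r: float) -> str:
    """Label |r| using the bands quoted in the text (lower bounds as cut-offs)."""
    size = abs(r)
    if size >= 0.9:
        return "very strong"
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "none"
```

Under these bands, all three subscale correlations reported above (0.835, 0.834, and 0.792) are classified as strong.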

Construct Validity Analysis (Discriminant Validity)

The results of the Kruskal–Wallis analysis of variance for DHI showed a statistically significant difference for the considered subgroups (p < 0.001) (Fig. 3). The differences in multiple comparisons (between the CG and patients from groups I, II, III, and IV) for DHI were analyzed using a nonparametric Mann–Whitney test followed by Bonferroni correction. The results were considered statistically significant when the p value was less than 0.017. Significant differences were found between the CG and SG I to SG IV (p < 0.001). Moreover, the Mann–Whitney test was used to check whether there were statistically significant differences between the subgroups of the study group (SG). Due to multiple comparisons, Bonferroni corrections were applied, and the level of significance was assumed as α = 0.008. Statistically significant differences were found only between SG I and SG II (p < 0.001) (Fig. 4).
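The Bonferroni-adjusted significance levels can be checked with a few lines of arithmetic. The sketch below divides the nominal level of 0.05 by the number of comparisons; the function names are hypothetical. The count of six comparisons follows from comparing four subgroups pairwise. The link between the 0.017 threshold and three comparisons is an inference from the arithmetic, not something stated in the text.

```python
from math import comb


def pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs among n_groups groups."""
    return comb(n_groups, 2)


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("at least one comparison is required")
    return alpha / n_comparisons


# Four study subgroups compared pairwise: 6 comparisons, 0.05 / 6 is about 0.008.
subgroup_alpha = bonferroni_alpha(0.05, pairwise_comparisons(4))

# Dividing 0.05 by 3 comparisons gives about 0.017 (inferred, see lead-in).
three_way_alpha = bonferroni_alpha(0.05, 3)
```

A p value is then declared significant only if it falls below the adjusted level, which keeps the overall probability of a false positive across all comparisons at or below the nominal 0.05.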

Fig. 3 Box plot for the control group, entire study group and for each subgroup with DHI results

Fig. 4 Box plot for each subgroup with PAS results

For the swallowing safety score, we found statistically significant differences between subgroups (Kruskal–Wallis test). Moreover, the Mann–Whitney test was used to check whether there were statistically significant differences between the subgroups of the study group (SG). Due to multiple comparisons, Bonferroni corrections were applied, and the level of significance was assumed as α = 0.003. Statistically significant differences were found only between SG II and SG IV (p < 0.001) (Table 5).

Table 5 PAS scores in subgroups of study group and comparison of PAS scores between subgroups of the SG

Regardless of the consistency tested, 39 (26.9%) patients presented no sign of aspiration or penetration, 57 (39.3%) showed penetration (PAS 2–5), and 49 (33.8%) showed aspiration (PAS 6–8). The percentage distribution of PAS scores for each subgroup of the SG group is given in Fig. 5.

Fig. 5 The percentage distribution of PAS scores for each subgroup of the SG group

Criterion Validity Analysis

We found statistically significant weak positive correlations between swallowing safety, rated through the worst PAS score, and the physical (rS = 0.205, p = 0.013), functional (rS = 0.266, p = 0.001), and emotional (rS = 0.182, p = 0.029) subscales, as well as the total DHI score (rS = 0.243, p = 0.003).


The correlation between the DHI total score and the VAS value is presented in Table 6. A high positive correlation was observed between the DHI total score and the VAS.

Table 6 R-Spearman coefficient between total PL-DHI score and VAS


In the last few years, the quality of life related to swallowing and voice has received a great deal of attention. However, no validated questionnaire is yet available in the Polish language to evaluate QoL related to swallowing. The purpose of this study was therefore to evaluate the validity and reliability of the PL-DHI. The number of subjects in the SG (n = 178) may be seen as a strength of the present study.

The results revealed that the PL-DHI is a reliable tool with good internal consistency and test–retest reliability. These results are similar to those for the original DHI and for the translations of the DHI into other languages [16, 18, 20,21,22].

The Cronbach’s alpha coefficient for the total PL-DHI and physical, emotional and functional subscales was between 0.864 and 0.955, indicating that the PL-DHI had good internal consistency. Alpha values lower than 0.70 reflect a low correlation among the instrument items and may suggest an insufficient or poorly chosen set of items. The obtained α values could be interpreted as satisfactory.

All items contributed similarly to the PL-DHI internal consistency, and thus no DHI items were inserted or deleted.

The median score for the SG was 36, with the highest median, 48 points, in SG I (neurological patients). The SG median is significantly higher than the control group median (4 points). Based on the PL-DHI score, it is possible to discriminate between healthy controls and OD patients. However, one patient in the study group scored 0 despite the presence of OD features in the FEES examination. This shows that the questionnaire remains a subjective diagnostic tool. The normative data generated in the present study align with the DHI normative values from the systematic review and meta-analysis by Sobol et al., who reported 2.49 (0.51–4.48) as the DHI mean [19].

The mean PL-DHI score (40.1 ± 27.2) was significantly higher than that of the original DHI (27.33 ± 21.18) but very similar to that of the Hebrew DHI (38.44 ± 24.39). Comparison of the different translations reveals a wide range of scores. Table 7 presents a comparison of data from different DHI translation studies. The score for the Japanese DHI [22] is markedly lower than all others. As Shapira [18] observed, this may be attributable to cultural differences in the self-appreciation of dysphagia severity or to differences between the populations tested. Additionally, Ginocchio underlines that the significantly higher DHI scores reported by OD patients compared to healthy participants adequately reflect the impact of OD on patients’ health-related quality of life [30].

Table 7 Comparison among data of different DHI translation studies

In the SG, the score for the physical subscale was higher than the scores for the functional and emotional subscales. Similar results were observed in the translations of the DHI into other languages [16, 18, 20,21,22]. This tendency may result from the fact that symptoms from the physical subscale, including coughing, choking, and unintentional weight loss, may have the greatest impact on the quality of life.

Consistent with the Italian validation study [30], the PL-DHI was weakly associated with swallowing safety.

Limitations and Future Directions

It is important to acknowledge the limitations of the assessment process. Some studies have correlated this questionnaire with other quality of life questionnaires. This study did not, which can be considered a limitation.

Moreover, the self-assessment questionnaire performed in this study could be correlated with FEES or VFSS which are the gold standard for diagnosis of oropharyngeal dysphagia. Such comparative studies should be undertaken in future.

Due to the diverse etiology of swallowing disorders, the division of subjects into subgroups for the purposes of the study was difficult. We considered other divisions, but none were homogeneous. In particular, study group IV should be divided into smaller subgroups in future studies. Further research is needed to differentiate between head and neck cancer patients who received chemoradiation and those who did not, as well as those who were diagnosed with vocal fold paresis after head and neck surgery or underwent partial laryngectomy. Those patients were included in one subgroup on the basis of FEES evaluation.

Also, the study was designed at the end of 2015, so we used the Classical Test Theory to validate the PL-DHI, although the Item Response Theory is nowadays considered superior for validating patient-reported outcome measures [31]. Consequently, responsiveness and floor and ceiling effects were not evaluated. In addition, only construct validity was examined; content and criterion validity were not evaluated.

Future studies are required to validate the PL-DHI using the Item Response Theory and to examine the responsiveness of the DHI after implementation of behavioral strategies, and medical or surgical interventions.

Lastly, cognitive decline was not measured with any scale, e.g., the Mini-Mental State Examination. Using such a scale would make the exclusion criteria more objective.


Our study demonstrates that the PL-DHI maintained its validity and reliability as a self-assessment tool for oropharyngeal dysphagia patients’ QoL for a Polish-speaking population.

This means that the PL-DHI, as an easy-to-complete tool for assessing the consequences of dysphagia on QoL, is useful not only for researchers but also, above all, for patients in a clinical setting.