Patients with advanced head and neck cancer often suffer from oropharyngeal dysphagia as a result of the disease itself or its treatment [1]. Dysphagia can lead to malnutrition and dehydration as well as an increased risk of aspiration [2]. When objectifying a patient’s current health status and the effects of a therapeutic intervention, a quality-of-life instrument is considered an important evaluation tool [3].

A few questionnaires on health-related quality of life with respect to oropharyngeal dysphagia can be found in the literature: the SWAL-QOL [4], the MD Anderson Dysphagia Inventory (MDADI) [5], and the Deglutition Handicap Index (DHI) [6]. When a questionnaire is to be used for research, its psychometric characteristics must be well known and of sufficiently high quality; otherwise, the study results cannot be interpreted or attributed any clinical relevance. Although the reliability and validity of the SWAL-QOL have been described [4], little is known about the psychometric quality of the MDADI or the DHI. The SWAL-QOL is an elaborate 44-item questionnaire containing 11 subscales. Although the SWAL-QOL is commonly used in research, its application in daily clinical practice is limited since clinicians need a short, easy-to-handle questionnaire for screening. In that light, the validity and reliability of the Dutch versions of the DHI and the MDADI for use with oncological patients with oropharyngeal dysphagia were determined in this study.

Methods

Subjects

Patients were selected consecutively at the outpatient clinic for dysphagia at the Department of Otorhinolaryngology, Head and Neck Surgery and at the MAASTRO clinic in the Academic Hospital, both part of Maastricht University Medical Center (MUMC). Recruitment took place during visits to the outpatient clinic. A small number of patients were recruited by phone after their medical records had been studied. To be included in the sample, a patient must have been diagnosed by a laryngologist as having oropharyngeal dysphagia due to oncological disorders. Furthermore, a patient’s general condition must have been stable during repeated measurements. Finally, a patient could not have any cognitive limitations. The selected patients received verbal information about the study and were included in the sample only after giving their informed consent.

In total, 76 patients were included in the study: 57 (75%) men and 19 (25%) women, ranging in age from 45 to 83 years. The mean age was 64 for men and 61 for women. The status of the oral feeding restrictions was scored using the Functional Oral Intake Scale (FOIS) of Crary et al. [7]. Two subjects were tube dependent while all other subjects were on a totally oral diet. The latter took various forms: a diet of a single consistency (N = 7), one of multiple consistencies and requiring special preparation or compensation (N = 30), one not needing any special preparation but with some food limitations (N = 28), and a normal oral diet (N = 9).

Questionnaires

This study used four questionnaires: three on quality of life related to oropharyngeal dysphagia, namely, the SWAL-QOL [4], the MDADI [5], and the DHI [6]; plus a simple one-item visual analog scale, the Dysphagia Severity Scale. Both the MDADI and the DHI were translated into Dutch by three independent researchers; their versions were combined by mutual consensus to form one final translation. The Dysphagia Severity Scale needed no translation, and the SWAL-QOL had already been translated by Bogaardt et al. [8].

The first questionnaire, the SWAL-QOL, is considered the gold standard for determining quality of life in persons with oropharyngeal dysphagia. This 44-item tool exhibits good internal-consistency reliability and short-term reproducibility [4]. It consists of 11 subscales (see Table 1). The minimum and maximum scores per subscale are zero and 100, indicating an extremely impaired quality of life (0) versus no impairment (100) as experienced by the individual.

Table 1 Descriptive analysis of the MD Anderson Dysphagia Inventory (MDADI), the Deglutition Handicap Index (DHI), the Dysphagia Severity Scale, and the SWAL-QOL

The DHI is a 30-item questionnaire on deglutition-related aspects of daily life (5-point rating scale: 0–4). The questionnaire is subdivided into three domains of ten items each: emotional (psychosocial consequences), functional (nutritional and respiratory consequences), and physical (symptoms related to swallowing). The total score ranges from zero (indicating no handicap) to 120 (indicating maximum handicap) [6].

The MDADI consists of 20 items. Besides a global assessment (a single question), it comprises three subscales: the emotional subscale (8 items), the functional subscale (5 items), and the physical subscale (6 items). The global assessment refers to the individual’s swallowing difficulty as it affects one’s overall daily routine. The emotional, functional, and physical subscales refer to the individual’s affective response to the swallowing disorder, the impact of the disorder on daily activities, and the self-perception of the swallowing difficulties, respectively [5]. Using a five-point scale (1–5), the minimum total score is 20 and the maximum 100. In the original version of the MDADI, all but two items were scored such that higher scores indicated higher functioning. In the Dutch translation, it was decided to use a uniform scoring method. Thus, by adjusting the scoring of two items, low scores came to indicate low functioning and high scores high functioning.
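The uniform scoring described above can be illustrated with a minimal sketch. It assumes that adjusting an item means reverse-keying it on the 1–5 scale (a response x becomes 6 − x); the text does not state which two items were adjusted, so the positions in `REVERSED_ITEMS` are hypothetical placeholders, as is the function name.

```python
# Hypothetical 0-based positions of the two adjusted items. The text does not
# identify them, so these values are placeholders for illustration only.
REVERSED_ITEMS = frozenset({4, 14})


def mdadi_total(responses, reversed_items=REVERSED_ITEMS):
    """Sum 20 item responses (1-5) after reverse-keying the given items."""
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")
    total = 0
    for index, value in enumerate(responses):
        if not 1 <= value <= 5:
            raise ValueError("each response must lie between 1 and 5")
        # Reverse-keying maps 1<->5 and 2<->4, so a high score always
        # indicates high functioning.
        total += (6 - value) if index in reversed_items else value
    return total
```

With every item keyed in the same direction, the total stays within the stated range of 20 to 100.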

The Dysphagia Severity Scale is a self-designed evaluation tool consisting of one visual analog scale, quantifying the severity of the swallowing disorder and the extent of impairment experienced by the patient. A score of 100 (the maximum) indicates normal swallowing abilities, while a score of zero indicates extreme swallowing impairment or inability to swallow.

Protocol

Patients were asked to fill in all four questionnaires, either during their outpatient visit or at home when recruited by phone. Within 2 weeks after this first measurement [9], all patients received by post the MDADI, the DHI, and the Dysphagia Severity Scale for purposes of repeated measurement. The researchers made sure that all repeated measurements were sent back in time for adequate retest interval analysis [9], reminding patients by phone if necessary.

Statistical Analysis

Table 2 presents a glossary of the psychometric and statistical terms used in this study. Measurement properties of the MDADI and the DHI were determined and compared to the quality criteria as defined by Terwee et al. [10].

Table 2 Glossary of psychometric and statistical terms

First, the MDADI and DHI questionnaires were reviewed for possible floor and ceiling effects, noting the number of respondents who obtained the lowest or highest possible scores. Next, test-retest reliability was assessed by determining intraclass correlation coefficients (two-way random effects model, ICC) between repeated measurements on the MDADI, the DHI, and the Dysphagia Severity Scale. Confirmatory Maximum Likelihood (ML) factor analyses were performed to determine the number of (homogeneous) (sub)scales in each questionnaire. In addition, by computing Cronbach’s α coefficients, the internal-consistency reliability of the MDADI and the DHI was estimated. The associations among the four administered questionnaires plus the FOIS and among the subscales per instrument were determined using nonparametric Spearman’s correlation coefficients. (Sub)scales from the MDADI and the DHI that were supposed to measure the same concept were compared to determine construct validity (convergent validity). Finally, criterion validity was determined by computing nonparametric Spearman’s correlations between the SWAL-QOL (reference or gold standard) and both the MDADI and the DHI. All statistical analyses were performed using SPSS for Windows 15.0.1 (SPSS Inc., Chicago, IL).

Results

Table 1 presents the descriptive statistics for all four questionnaires. To examine a possible floor or ceiling effect, the total score of the MDADI, the total score of the DHI, and the Dysphagia Severity Scale were visualized by means of histograms (Fig. 1a–c). These figures show the number of respondents who obtained the lowest or highest possible scores. As fewer than 15% of the respondents obtained the lowest or highest possible score, no floor or ceiling effect was considered to be present [10, 11].
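The 15% criterion for floor and ceiling effects can be expressed as a short check. This is a sketch only: the function names are illustrative, and it assumes an effect is flagged when more than 15% of respondents score at either extreme.

```python
def floor_ceiling_shares(scores, lowest, highest):
    """Return the fractions of respondents at the lowest and highest possible score."""
    if not scores:
        raise ValueError("no scores given")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return floor_share, ceiling_share


def has_floor_or_ceiling_effect(scores, lowest, highest, threshold=0.15):
    """Flag an effect when either extreme holds more than `threshold` of respondents."""
    floor_share, ceiling_share = floor_ceiling_shares(scores, lowest, highest)
    return floor_share > threshold or ceiling_share > threshold
```

For the MDADI total score, for example, `lowest` would be 20 and `highest` 100; for the DHI total score, 0 and 120.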

Fig. 1

a Data distribution on the MDADI. The number of patients is displayed as a function of the Total Score on the MDADI. The area under the curve equals the total number of patients. b Data distribution on the DHI. The number of patients is displayed as a function of the Total Score on the DHI. The area under the curve equals the total number of patients. c Data distribution on the Dysphagia Severity Scale. The number of patients is displayed as a function of the score on the Dysphagia Severity Scale. The area under the curve equals the total number of patients

To assess test-retest reliability, intraclass correlation coefficients (two-way random effects model, ICC) have been determined between repeated measurements on the total scores of the MDADI and the DHI and on the Dysphagia Severity Scale. The ICCs were 0.96, 0.94, and 0.87, respectively. A positive rating for reliability can be given only when the ICC is at least 0.70 in a sample size of at least 50 patients [10]. Because of missing values, the actual sample sizes used for ICC computation were 64 (MDADI), 35 (DHI), and 49 (Dysphagia Severity Scale). The reliability of the DHI could not be determined appropriately as a consequence of too little data. Both of the other instruments are considered to have good test-retest reliability.
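As an illustration of the test-retest computation, the sketch below derives a two-way random effects ICC from the ANOVA mean squares. The text does not specify which variant was used, so the single-measure, absolute-agreement form, often written ICC(2,1), is an assumption here, and the function name is illustrative. Respondents with a missing measurement are assumed to have been removed beforehand.

```python
def icc_two_way_random(ratings):
    """Single-measure, absolute-agreement ICC for a two-way random effects model.

    `ratings` holds one row per patient and one column per measurement
    occasion (e.g. test and retest), with no missing values.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 patients and 2 measurements")
    if any(len(row) != k for row in ratings):
        raise ValueError("all rows must have the same length")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for patients (rows), occasions (columns), and residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    return (ms_rows - ms_error) / denominator
```

With test and retest scores for each patient, a result of at least 0.70 would meet the reliability criterion cited above, provided the sample contains at least 50 patients.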

Internal consistency is an important measurement property for questionnaires. It describes the extent to which items in a questionnaire (sub)scale are correlated and thus measure the same concept. When a theoretical model exists or the factor structure has been determined previously, confirmatory factor analysis should be applied to determine the number of (homogeneous) (sub)scales. To that end, a confirmatory Maximum Likelihood (ML) factor analysis was performed using all items of the MDADI to test whether three factors could be distinguished (namely, the three subscales). However, this three-factor model was rejected (goodness-of-fit test, P < 0.001). A four-factor model, referring to the global assessment as a possible fourth factor, was rejected as well (P = 0.003). A confirmatory ML factor analysis using all items of the DHI and a three-factor model also called for rejection of the possibility of three underlying constructs or subscales (goodness-of-fit test, P < 0.001).

Still, as the subject population was rather limited, further analysis was performed to gather more information about the questionnaires’ psychometric properties. Cronbach’s α was determined because it is considered an adequate measure of internal-consistency reliability. A low Cronbach’s α (α ≤ 0.70) suggests a lack of correlation [9], whereas a high Cronbach’s α (α > 0.90) indicates redundancy of one or more items [9, 12]. Cronbach’s α was calculated separately for each (sub)scale of the MDADI and the DHI (Table 3). All Cronbach’s α values lie between 0.76 and 0.94, thus indicating good internal consistency, although some redundancy may be present. Considering the outcome of the factor analyses—no obvious homogeneous (sub)scales detected and adequate Cronbach’s α values found per (sub)scale—the internal consistency of both questionnaires seems to remain unclear [10].
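Cronbach’s α for one (sub)scale can be computed from the item variances and the variance of the summed score. The sketch below uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total) together with the cut-offs given above; the function names and labels are illustrative, and complete data per respondent are assumed.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one (sub)scale.

    `item_scores` holds one row per respondent and one column per item.
    """
    n = len(item_scores)
    k = len(item_scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    # Sample variance of each item and of the summed (sub)scale score.
    item_variances = [variance([row[j] for row in item_scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Apply the cut-offs from the text: <= 0.70 low, > 0.90 possible redundancy."""
    if alpha <= 0.70:
        return "low"
    if alpha > 0.90:
        return "possible redundancy"
    return "adequate"
```

Under these cut-offs, the lowest reported value (0.76) would be labeled adequate, and the highest (0.94) would be labeled as possible redundancy.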

Table 3 Cronbach’s α per (sub)scale of the MD Anderson Dysphagia Inventory (MDADI) and the Deglutition Handicap Index (DHI)

The associations among the four patient-administered questionnaires plus the FOIS and among the subscales per instrument were determined by nonparametric Spearman’s correlation coefficients as well (Tables 4, 5). For the correlation coefficients (R), a minimum value for a strong correlation was set at 0.7 [13–15]. Correlation coefficients between 0.3 and 0.7 were considered a substantial correlation, and R values less than 0.3 were considered a weak correlation. Negative correlations with the DHI are expected because all instruments except the DHI associate lower scores with more severely impaired quality of life or more restricted functional oral intake. Correlations between the quality-of-life instruments and the functional feeding status proved low (−0.013 ≤ R ≤ 0.53). Construct validity could be determined by comparing the (sub)scales from the MDADI and the DHI that were supposed to measure the same concept. Associations between similar subscales from both questionnaires as well as both total scores demonstrated whether they defined the same target construct (convergent validity). Correlation coefficients for the emotional, functional, and physical subscales from the MDADI and the DHI were −0.93, −0.65, and −0.62, respectively. The correlations between the Dysphagia Severity Scale and both total scores from the MDADI and the DHI were rather low (0.45 and −0.52, respectively), whereas the correlation between both total scores of the MDADI and the DHI was strong (R = −0.87). The mean correlation coefficients between the subscales of the MDADI and between the subscales of the DHI were 0.80 (0.66 ≤ R ≤ 0.82) and 0.60 (0.54 ≤ R ≤ 0.66), respectively.
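The correlation analysis and the strength categories can be sketched as follows. Spearman’s coefficient is computed here as the Pearson correlation of average ranks, which handles tied scores. Applying the cut-offs to the absolute value of R is an assumption, made because negative coefficients such as −0.87 are described as strong; the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based ranks start+1..end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / (spread_x * spread_y)


def correlation_strength(r):
    """Categories from the text, applied to |R| (an assumption, see lead-in)."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.3:
        return "substantial"
    return "weak"
```

For example, `correlation_strength(-0.87)` falls in the strong category and `correlation_strength(0.45)` in the substantial one.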

Table 4 Associations among the MDADI, the DHI, the Dysphagia Severity Scale, and the FOIS (nonparametric Spearman’s correlation coefficients)
Table 5 Associations among the SWAL-QOL versus the MDADI, the DHI, the Dysphagia Severity Scale, and the FOIS (nonparametric Spearman’s correlation coefficients)

When considering the SWAL-QOL as the reference standard or gold standard, the extent to which the MDADI and the DHI agreed or correlated with the SWAL-QOL could be defined as the questionnaires’ criterion validity. Table 5 presents the associations among the SWAL-QOL versus the MDADI, the DHI, the Dysphagia Severity Scale, and the FOIS (nonparametric Spearman’s correlation coefficients). The mean correlation coefficients for the subscales from the SWAL-QOL versus the total score of the MDADI, the total score of the DHI, and the Dysphagia Severity Scale were 0.67 (0.39 ≤ R ≤ 0.86), −0.61 (−0.80 ≤ R ≤ −0.38), and 0.36 (0.30 ≤ R ≤ 0.73), respectively. Next, based on the authors’ clinical experience, subscales that were considered to be of lesser importance to oropharyngeal dysphagia were excluded by mutual consensus. Thus, when excluding the subscales Fear, Sleep, Fatigue, and Communication, the mean correlation coefficients as determined for this restricted group of subscales were 0.76 (0.62 ≤ R ≤ 0.86), −0.71 (−0.80 ≤ R ≤ −0.60), and 0.42 (0.31 ≤ R ≤ 0.73), respectively. According to Terwee et al. [10], the correlation with the reference standard needs to be at least 0.70. Only after the less relevant subscales of the SWAL-QOL had been excluded did both the MDADI and the DHI show satisfactory associations with the reference standard.

Discussion

In this study, the psychometric characteristics for the MDADI and the DHI have been determined. The Dysphagia Severity Scale was introduced to reveal any advantages or disadvantages of using elaborate questionnaires compared to using a simple visual analog scale, while the SWAL-QOL was considered the reference or gold standard. None of the quality-of-life questionnaires showed any floor or ceiling effect. The test-retest reliability of the MDADI and the Dysphagia Severity Scale proved to be good. However, because too much data were missing for the DHI, its test-retest reliability could not be determined, although the intraclass correlation coefficients were rather high. The internal consistency using Cronbach’s α seemed to be good. However, when applying confirmatory factor analysis, the underlying constructs as defined by the subscales per questionnaire could not be distinguished. Probably because of unclear constructs, only the two emotional subscales were strongly correlated, whereas the associations between the other corresponding subscales were just moderate. Overall, the Dysphagia Severity Scale showed rather low correlations with the other three questionnaires. It seemed that a detailed questionnaire could not be replaced by a single one-item scale quantifying the severity of the swallowing disorder. The concepts being measured proved to be different. When considering the criterion validity, the MDADI and the DHI showed satisfactory associations with the SWAL-QOL after having removed its less relevant subscales.

Considering both the MDADI and the DHI, it is concluded that neither of these two questionnaires will generate perfect psychometric data. Although researchers should strive to use questionnaires with the best possible properties, the ultimate choice will be made by future researchers themselves. Depending on the purposes of their studies, they may choose the somewhat elaborate SWAL-QOL or one of the other two questionnaires with reasonable (though not perfect) psychometric characteristics. Another solution might be to develop a new quality-of-life questionnaire.

Conclusions

In conclusion, when assessing the validity and reliability of the Dutch version of the MDADI and the DHI, not all criteria for psychometric properties have been adequately met. In general, the importance of determining these characteristics and of objectifying concepts such as validity and reliability must be stressed when developing a questionnaire. If a questionnaire’s quality proves to be poor, the study results cannot be interpreted correctly nor can any clinical relevance be determined. Therefore, it is recommended that future outcome studies should use only quality-of-life questionnaires that have sufficiently good psychometric characteristics.