Introduction

Fatigue is one of the most common symptoms cancer patients experience [1, 2]. Cancer survivors are also often affected by this burdensome symptom [3]. In contrast to normal tiredness, fatigue cannot be relieved by common strategies known to restore energy [4]. In addition to cancer, heightened levels of fatigue can also be found in patients with other diagnoses such as cardiovascular diseases [5], lung diseases [6], and rheumatoid arthritis [7].

Multiple questionnaires have been developed to assess fatigue; a review paper summarizes 40 such instruments [8]. One frequently used instrument is the Multidimensional Fatigue Inventory (MFI-20). This questionnaire was developed in 1995 in the Netherlands [9] and has been translated into English and many other languages, e.g., German [10], French [11], Swedish [12, 13], Spanish [14], Korean [15], Hindi [16], and Chinese [17]. General population studies have been performed with the MFI-20 in several countries [13, 14, 18].

There is a long-standing debate about the factorial structure of the MFI-20. Several examinations failed to confirm the five-dimensional structure (general fatigue, physical fatigue, reduced activity, reduced motivation, mental fatigue) that was proposed by the original test authors [11, 19, 20, 21]. A French study [19] retained all 20 items and combined two of the dimensions into one, thereby resulting in four dimensions. Another study [11] restricted the analysis to 15 items and arrived at four dimensions. A Polish study also removed five items and assigned the remaining items to only three factors. The Swedish general population study [13] did not change the factorial structure, but the researchers reported the results of only four of the five scales; the subscale reduced motivation was ignored because of poor psychometric properties. A study of 1494 German cancer patients restricted the analyses to one of the five dimensions, namely general fatigue [22].

One putative reason for the poor fit indices is that every MFI-20 scale includes two positively oriented and two negatively oriented items. Using items of different orientation can reduce the reliability of the scales [23, 24]. Therefore, Baussard et al. [25] developed a shortened version of the questionnaire, the MFI-10. It consists only of the 10 items that indicate the presence of fatigue and excludes the items worded in the opposite direction.

The main objectives of this paper were to test the MFI-20 in a large sample of German cancer patients and to examine factors associated with fatigue. In particular, the aims were (a) to examine cancer patients’ levels of fatigue in comparison with those of the general population, (b) to analyze the impact of age, gender, and clinical factors (tumor site, time since diagnosis, ECOG performance status, presence of metastases, and setting) on fatigue, (c) to examine the relationship between the fatigue dimensions and other scales of quality of life (QoL), and (d) to test psychometric properties of the MFI-20 and the short-form MFI-10.

Methods

Cancer patients

This multicenter study included cancer patients receiving treatment in acute care hospitals, outpatient facilities, and rehabilitation clinics. The aim was to obtain a sample of German cancer patients that was as representative as possible. Five study centers in Germany contributed to the overall project; three of them also included the MFI-20. The following institutions were involved at each study center: the local university hospital, at least one other maximum care hospital, at least two ambulatory facilities, and at least two rehabilitation clinics. Further details of the study methods have been described elsewhere [26]. Results of this study concerning the prevalence of mental disorders have already been published [27, 28]. Inclusion criteria were the presence of a malignant tumor and age between 18 and 75 years. Study candidates were excluded if they had a severe physical, cognitive, and/or verbal impairment that would interfere with their ability to give informed consent. Trained research assistants contacted the patients who fulfilled the inclusion criteria and asked them to participate. All participating patients provided written informed consent. The response rate was 68.1%. The whole study comprised 4020 patients. In three of the five participating study centers (Leipzig, Hamburg, and Freiburg), further questionnaires were administered in addition to the core questionnaires given to all patients; one of these was the MFI-20. MFI-20 data sets are therefore available only for a subsample of the 4020 patients. The study was conducted in accordance with the Declaration of Helsinki and was approved by the ethics committees of all participating centers.

General population

The general population sample was derived from a general population survey [18]. Starting from more than 200 sampling points representing all regions of Germany, street, house, and flat were chosen via the random-route technique. Finally, the target person in the household was also selected randomly using the Kish-selection-grid technique. The response rate of this examination was 68%. The sample was fairly representative of the general population of Germany in terms of age, gender, and education. In total, the sample of the general population comprised n = 1993 people in the age range 18–93 years, 874 of whom were males and 1119 females.

Since the average age of the cancer patients’ sample was higher than that of the general population sample, we selected a subsample of the 1993 persons so that the age and gender distributions of that group were nearly identical to those of the patients. This was achieved by successively removing younger participants and females until the mean age and gender distribution of the cancer patients was reached. The selected subsample comprised 1397 persons (630 males and 767 females; proportion of females, 54.9%) with a mean age of 58.5 years, which is nearly identical to the corresponding values of the patients’ sample.
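The trimming procedure described above can be illustrated with a minimal sketch. The text does not specify the order of removals or which women were removed, so the greedy rule below (drop the youngest person while the mean age is too low, otherwise drop the youngest woman while the female share is too high) is an assumption, and the function name and data layout are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

Person = Tuple[int, str]  # (age in years, gender coded "f" or "m"); hypothetical layout


def trim_to_targets(sample: List[Person],
                    target_mean_age: float,
                    target_female_share: float) -> List[Person]:
    """Greedily remove people until mean age and female share reach the targets.

    Assumed rule (not taken from the paper): remove the youngest person while
    the mean age is below target; otherwise remove the youngest woman while
    the share of women is above target.
    """
    kept = sorted(sample)  # youngest first
    while len(kept) > 1:
        female_share = sum(g == "f" for _, g in kept) / len(kept)
        if mean(age for age, _ in kept) < target_mean_age:
            kept.pop(0)  # drop the youngest remaining participant
        elif female_share > target_female_share:
            idx = next(i for i, (_, g) in enumerate(kept) if g == "f")
            kept.pop(idx)  # drop the youngest remaining woman
        else:
            break
    return kept
```

Because removing women can lower the mean age again, the loop rechecks both conditions after every removal rather than treating age and gender in two separate passes.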

Instruments

MFI-20

The MFI-20 consists of 20 items which belong to 5 dimensions. Each item has to be answered on a five-point Likert scale (range 1–5); the scale scores range from 4 to 20. An item example is: “I feel very active.” Each scale consists of two positively oriented items and two negatively oriented items. Although the authors of the original test did not recommend calculating a total score over all 20 items, it is possible to use such a sum score [29]. The shortened MFI-10, according to Baussard et al. [25], consists of the 10 positively oriented items (items 2, 5, 9, 10, 13, 14, 16, 17, 18, and 19) which indicate the presence of fatigue and omits those negatively oriented items which indicate the absence of fatigue. Though this MFI-10 can also be decomposed into three subscales [25], we use the MFI-10 as a 10-item scale in the descriptive analyses.
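The scoring logic can be sketched as follows. The MFI-10 item numbers are those given in the text; the assignment of items to the five subscales is an assumption based on the commonly published MFI-20 scoring key and is not stated in this paper. Which items have to be reverse-coded depends on how the answer sheet is coded, so it is passed in as a parameter. All names are hypothetical.

```python
from typing import Dict, Iterable, Mapping, Tuple

# Assumption: item-to-subscale key as commonly published for the MFI-20.
SUBSCALES: Dict[str, Tuple[int, ...]] = {
    "general_fatigue": (1, 5, 12, 16),
    "physical_fatigue": (2, 8, 14, 20),
    "reduced_activity": (3, 6, 10, 17),
    "reduced_motivation": (4, 9, 15, 18),
    "mental_fatigue": (7, 11, 13, 19),
}

# Items that indicate the presence of fatigue (MFI-10 items named in the text).
MFI10_ITEMS: Tuple[int, ...] = (2, 5, 9, 10, 13, 14, 16, 17, 18, 19)


def score_mfi(responses: Mapping[int, int],
              reverse_items: Iterable[int]) -> Dict[str, int]:
    """Return the five subscale scores (4-20) plus MFI-20 and MFI-10 sum scores.

    responses: item number (1-20) -> raw answer (1-5), no missing values.
    reverse_items: items whose raw answer is flipped (6 - x) so that a higher
    value always means more fatigue.
    """
    if set(responses) != set(range(1, 21)):
        raise ValueError("expected answers for items 1-20")
    if any(v not in (1, 2, 3, 4, 5) for v in responses.values()):
        raise ValueError("answers must be integers from 1 to 5")
    reverse = set(reverse_items)
    recoded = {i: (6 - v if i in reverse else v) for i, v in responses.items()}
    scores = {name: sum(recoded[i] for i in items)
              for name, items in SUBSCALES.items()}
    scores["mfi20_total"] = sum(recoded.values())
    scores["mfi10_total"] = sum(recoded[i] for i in MFI10_ITEMS)
    return scores
```

With four items per subscale and answers from 1 to 5, each subscale score falls between 4 and 20, the MFI-20 sum score between 20 and 100, and the MFI-10 sum score between 10 and 50.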

EORTC QLQ-C30

The QoL questionnaire EORTC QLQ-C30 [30] consists of 15 scales: five functioning scales (physical, role, cognitive, emotional, and social functioning), eight symptom scales, one item concerning financial difficulties, and one 2-item global QoL scale. One of the symptom scales is the three-item fatigue scale; an example item is: “Were you tired?” High functioning scores and low symptom scores indicate good QoL. It is also possible to use a summarizing score that averages across all functioning scores and all (inverted) symptom scores according to Giesinger et al. [31]. Normative values of the EORTC QLQ-C30 are available [32, 33].

Statistical analyses

Effect sizes d were used to express the mean score differences between patients and the general population in relation to the standard deviations. A two-way ANOVA was performed to test the influence of age group (five categories) and gender on fatigue in the patients’ sample. The effects of clinical setting, tumor stage, metastases, and ECOG performance status on fatigue were tested with three-way ANOVAs with the cofactors age group and gender. Cronbach’s alpha was chosen to indicate the reliability of the scales. The associations between the fatigue scales and other dimensions of QoL were tested with Pearson correlations. Confirmatory factor analyses (CFAs) were calculated with Mplus. We tested the one-dimensional and the originally designed five-dimensional model of the MFI-20. In addition, we tested the short form MFI-10, also in terms of a one-dimensional model and a three-dimensional model according to Baussard et al. [25]. Several fit indices were examined to evaluate the overall fit of each model: the χ² goodness-of-fit statistic, the comparative fit index (CFI), the Tucker–Lewis index (TLI), the standardized root mean square residual (SRMR), and the root mean square error of approximation (RMSEA) according to Hu and Bentler [34]. The statistical analyses, except the CFAs, were performed with SPSS version 24.
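For readers who want to reproduce the two simplest statistics, the following sketch shows one common way to compute them. It assumes that d is based on the pooled standard deviation of the two groups (the text only says "in relation to the standard deviations") and uses the standard formula for Cronbach's alpha; it is not the SPSS implementation, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Mean difference divided by the pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; items[j][i] is respondent i's answer to item j."""
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    totals = [sum(answers) for answers in zip(*items)]  # per-respondent sum score
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))
```

Both functions use sample variances (denominator n - 1) and require complete data; missing answers would have to be handled before calling them.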

Results

Sample characteristics

Because the MFI-20 was administered in only three of the study centers, only 1824 of the 4020 patients filled in the MFI-20, at least in part. We restricted the analyses to those participants who had at least three valid scores for each of the five scales. This resulted in 1818 patients with valid MFI-20 scores (Table 1).
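The inclusion rule (at least three valid answers on each of the five scales) can be written as a small check. The item-to-subscale key is again an assumption based on the commonly published MFI-20 scoring key, and the paper does not say how a single missing answer within a scale was handled, so the sketch only decides whether a questionnaire counts as valid.

```python
from typing import Dict, Mapping, Optional, Tuple

# Assumption: item-to-subscale key as commonly published for the MFI-20.
SUBSCALE_ITEMS: Dict[str, Tuple[int, ...]] = {
    "general_fatigue": (1, 5, 12, 16),
    "physical_fatigue": (2, 8, 14, 20),
    "reduced_activity": (3, 6, 10, 17),
    "reduced_motivation": (4, 9, 15, 18),
    "mental_fatigue": (7, 11, 13, 19),
}


def has_valid_mfi20(responses: Mapping[int, Optional[int]],
                    min_valid: int = 3) -> bool:
    """True if every subscale has at least `min_valid` answered items.

    A missing answer is represented by None or by an absent key.
    """
    return all(
        sum(responses.get(item) is not None for item in items) >= min_valid
        for items in SUBSCALE_ITEMS.values()
    )
```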

Table 1 Characteristics of the cancer patients’ sample

Comparison between patients and the general population

Figure 1 presents the MFI-20 mean scores (sum scores) for the cancer patients and the general population. While in the general population there is a clear increase with increasing age, no such age trend was detected among the patients. In the age range 71 years and above, there are only marginal differences between the patients and the general population.

Fig. 1 Mean scores of the MFI-20 total score for patients and general population, broken down by gender and age

Mean scores of the subscales and reliability coefficients

The mean scores of the five subscales, the MFI-10 total score, and the MFI-20 total score are given in Table 2. Female patients showed slightly higher fatigue mean scores (total M = 56.0) than males did (total M = 53.3). A comparable gender difference was found for each subscale of the MFI-20; the highest gender differences were found for the general fatigue and mental fatigue subscales. The ANOVA results reflecting the impact of gender and age group on the MFI-20 total score for the patients’ group were as follows: gender: F = 6.3, p = 0.012, age group: F = 0.599, p = 0.663, and interaction gender * age group: F = 0.418, p = 0.796.

Table 2 Mean scores of the MFI-20 scales including the sum scores of MFI-20 and MFI-10, comparison between cancer patients and general population, and reliability coefficients

The patients reported higher levels of fatigue than the general population did on all subscales. However, the differences were small in magnitude for reduced motivation (d = 0.09) and mental fatigue (d = 0.35), while the effect sizes were higher than 0.60 for the other three subscales (Table 2). The reliability coefficients of the subscales were between 0.71 and 0.87; the reliability of the MFI-20 total score was highest with alpha = 0.94 (Table 2).

Clinical factors and fatigue

Table 3 presents the MFI-20 mean scores for the cancer types. The highest burden of fatigue was found among patients suffering from cancers of the blood and blood-forming organs, the skin, the category eye, brain, CNS, and mesothelial and soft tissue. Relatively low fatigue scores were reported by patients suffering from cancers of the male genital organs and breast cancer.

Table 3 The impact of tumor localization on fatigue

The impact of the clinical setting (inpatient, outpatient, and rehabilitation), tumor stage, the presence of metastases, and the ECOG status on fatigue is given in Table 4. Fatigue increased with higher cancer stages, the presence of metastases, and higher ECOG scores.

Table 4 The impact of the clinical setting and clinical variables on fatigue

Correlations between MFI-20 and other QoL scales

Among the 15 scales of the EORTC QLQ-C30, the three-item fatigue scale showed the highest correlation coefficients (between 0.49 and 0.77) with the MFI-20 scales (Table 5). Since the three items of the EORTC QLQ-C30 fatigue scale mainly indicate physical fatigue, the correlations with the MFI-20 scales general fatigue and physical fatigue are the highest. Of the five MFI-20 subscales, in most cases, general fatigue and physical fatigue showed the highest coefficients. Relatively weak associations were observed for the last two subscales, reduced motivation and mental fatigue, with the exception of the high correlation between cognitive functioning and mental fatigue (r = − 0.71).

Table 5 Correlations between the MFI-20 scales including the sum scores of MFI-20 and MFI-10 and the scales of the EORTC QLQ-C30

When comparing the coefficients for the MFI-10 and the MFI-20 total scores, nearly all coefficients of the MFI-20 were slightly higher, in most cases with a difference of between 0.02 and 0.03.

Factor analyses

Table 6 presents the CFA results. The one-dimensional model of the MFI-20 (model 1) shows the weakest fit coefficients. Considering the five dimensions (model 2) results in a remarkable improvement of the fit, though the coefficients CFI and TLI did not reach the thresholds for good model fit. The one-dimensional model of the MFI-10 (model 3) yielded better fit indices than the one-dimensional MFI-20. The last row in Table 6 takes into account the factorial structure of the MFI-10 (model 4); the resulting fit indices are marginally better than those of the MFI-20 when the five factors are taken into account (model 2).

Table 6 CFA fit indices
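As a reading aid for the fit indices, the sketch below shows the textbook definitions of RMSEA, CFI, and TLI in terms of χ² values and degrees of freedom. CFI and TLI additionally need the χ² of a baseline (independence) model. Software packages can differ in details (for example, n versus n - 1 in the RMSEA denominator), so these functions are illustrative and are not claimed to reproduce the Mplus output; the function names are hypothetical.

```python
from math import sqrt


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation (variant with n - 1)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int,
        chi2_base: float, df_base: int) -> float:
    """Comparative fit index relative to a baseline model."""
    numerator = max(chi2_model - df_model, 0.0)
    denominator = max(chi2_model - df_model, chi2_base - df_base, 0.0)
    return 1.0 if denominator == 0 else 1.0 - numerator / denominator


def tli(chi2_model: float, df_model: int,
        chi2_base: float, df_base: int) -> float:
    """Tucker-Lewis index relative to a baseline model."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)
```

Lower RMSEA values and higher CFI and TLI values indicate better fit, which is the direction in which the models are compared above.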

Discussion

The first aim of this study was to determine the burden of fatigue experienced by cancer patients. The cancer patients’ fatigue level was markedly higher than that of the general population. The effect size of the MFI-20 total score difference between the patients and the general population was d = 0.58, which is above the criterion given by Norman et al. [35], who proposed adopting half a standard deviation (d = 0.50) as a criterion of clinical significance. Figure 1 shows that the effect of age on fatigue differs between the patients and the general population. While we observed a clear increase in fatigue with increasing age in the general population, there was no statistically significant age effect in the patients’ sample. Clinicians should be aware that young cancer patients in particular suffer from fatigue compared with their healthy peers and that they need special support in the treatment of fatigue. This phenomenon, a nearly linear increase in the general population and small age effects in the patients, is not cancer-specific; it can also be found in other groups of patients [36]. However, in the Colombian general population study [14] the age trend was weaker than in the German normative study, and in the Swedish general population study [13], the age trend did not occur at all.

Concerning the five dimensions of the MFI-20, the most reliable and valid scales were general fatigue, physical fatigue, and reduced activity. The reliability of these scales was high with alpha coefficients above 0.80, and the differences between the patients and the general population (d > 0.60) were also the greatest on these scales. The lowest contribution was obtained from the scale reduced motivation which had an insufficient alpha coefficient and revealed only marginal differences between the groups (patients and general population).

Even though the MFI-20 with its five dimensions already covers a relatively broad spectrum of fatigue, qualitative studies show further characteristics of this issue. A recent meta-analysis of qualitative studies [37] identified six constructs in the sense of new interpretations of fatigue: embodied experience, (mis)recognition, small horizon, role changes, loss of self, and regaining one’s footing. Nevertheless, among the existing fatigue scales, the MFI-20 is one of the best at approaching these additional facets of fatigue.

The MFI-20 is well-suited to test the effects of acquiescence and response style because of the balanced proportions of positively and negatively oriented items [24]. The common variance of the equally oriented items is not reflected in the scale structure. The differences in item orientation therefore result in a certain degree of unexplained variance, which reduces the reliability of the scales. For this reason, it is interesting to test the MFI-10, which was constructed to omit such wording effects. The coefficients of the MFI-10 (reliability, effect sizes for the comparison between patients and general population, and correlations with the scales of the EORTC QLQ-C30) were slightly lower than those of the MFI-20, but they remained within an acceptable range. The shortening of the 20-item instrument to the 10-item version seems to be a good alternative for clinicians who are interested in using a shorter instrument.

As in other studies [10, 21, 38], the CFA fit indices of the MFI-20 were not satisfactory. We did not create a new factorial structure for the MFI-20, as was done by other researchers, since we believe that it is not useful to postulate new factors which make the results of different studies incomparable. The MFI-10 was, however, worth testing. The fit coefficients of the one-dimensional MFI-10 were better than those of the one-dimensional MFI-20, a result which can be interpreted as a consequence of having removed all of the items with an opposite direction [24]. Taking into account the subscale structure yielded better fit indices than those obtained with the one-dimensional models; the fit coefficients of the MFI-20 and the MFI-10 were of similar magnitude when the subscale structures were taken into account. Nevertheless, the reliability (Cronbach’s alpha) of the MFI-20 total score (0.94) was very good and higher than the coefficients of the MFI-10 total score (0.89) and the subscales. Moreover, the correlations of the MFI-20 total score with the scales of the EORTC QLQ-C30 were higher or at least as high as those of the MFI-20 subscales. Even when researchers and clinicians acknowledge that fatigue is a multidimensional construct, they are nevertheless often interested in having a summarizing score for fatigue. In such cases, both the MFI-20 total score and the MFI-10 total score are suitable measures.

There are no generally accepted cutoff scores for the MFI-20. In two studies, cutoff scores were used which were derived from a general population sample under the assumption that heightened fatigue means a score above the 75th percentile of the corresponding age and gender group [22, 29]. However, since this criterion is somewhat arbitrary, we preferred not to express the degree of fatigue in terms of persons above such a cutoff.
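For illustration only, the percentile-based criterion used in those two studies could be computed as sketched below. The percentile definition (linear interpolation, `method="inclusive"`), the data layout, and the function names are assumptions; as stated above, this criterion was not applied in the present analysis.

```python
from collections import defaultdict
from statistics import quantiles
from typing import Dict, Iterable, Tuple

Group = Tuple[str, str]  # (age group label, gender); hypothetical layout


def percentile75_cutoffs(
    norm_sample: Iterable[Tuple[str, str, float]]
) -> Dict[Group, float]:
    """75th percentile of the fatigue score per age-and-gender group.

    norm_sample yields (age_group, gender, score) tuples from a general
    population sample; every group needs at least two scores.
    """
    by_group = defaultdict(list)
    for age_group, gender, score in norm_sample:
        by_group[(age_group, gender)].append(score)
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    return {group: quantiles(scores, n=4, method="inclusive")[2]
            for group, scores in by_group.items()}


def is_heightened(score: float, group: Group,
                  cutoffs: Dict[Group, float]) -> bool:
    """True if the score lies above the 75th percentile of the person's group."""
    return score > cutoffs[group]
```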

The impact of tumor type on fatigue is presented in Table 3. The lowest fatigue levels were found for cancers of the male genital organs, mostly prostate cancer. This result has also been found in several other studies [39]. Since the overall gender differences in fatigue were small in magnitude (d = 0.15), this effect can only partly be accounted for by the male gender factor.

The clinical setting (inpatient, outpatient, and rehabilitation) only had a small impact on the patients’ fatigue levels. As such, one can justifiably compare fatigue assessments obtained in these varying settings. As was to be expected, tumor stage, the presence of metastases, and the ECOG performance score were clearly associated with fatigue. With the exception of one subscale, mental fatigue, all of the MFI-20 subscales contributed to these differences. Thus, it is justifiable to evaluate the impact of these factors on the basis of a fatigue total score.

Some limitations of this study should be mentioned. It is possible that there was a certain selection bias. Patients suffering from severe fatigue might be underrepresented, which means that our mean scores might underestimate the actual burden of fatigue present in this patient group. The data of the general population comparison group are from 2003; however, no normative study has been performed in Germany since then. We only tested the most important models with CFAs. We could have also tested other models proposed in the literature, e.g., three- or four-dimensional solutions. In addition, we could have calculated bifactor models, including the total factor as well as the five single factors, which generally yield better fit indices. When we analyzed the impact of tumor site and other clinical variables on the fatigue levels, we only used bivariate statistics. Tumor type and other variables such as tumor stage may be correlated and confounded with age and gender.

In summary, fatigue is a severe problem among cancer patients. The MFI-20 proved to be an appropriate instrument for measuring fatigue, and the MFI-10 is a good alternative for clinicians interested in using a shorter questionnaire.